{"id":602,"date":"2021-12-26T21:50:39","date_gmt":"2021-12-26T21:50:39","guid":{"rendered":"https:\/\/ml-gis-service.com\/?p=602"},"modified":"2021-12-26T21:51:46","modified_gmt":"2021-12-26T21:51:46","slug":"toolbox-get-all-files-of-a-specific-type-from-a-given-directory-in-python","status":"publish","type":"post","link":"https:\/\/ml-gis-service.com\/index.php\/2021\/12\/26\/toolbox-get-all-files-of-a-specific-type-from-a-given-directory-in-python\/","title":{"rendered":"Toolbox: Get all files of a specific type from a given directory in Python"},"content":{"rendered":"\n<p>Here is a common problem: we want a list of paths to the files with a specific extension, and all we have is the path to the directory that contains them. This task comes up so often in data processing that it is worth keeping a function for it at hand. The function is simple and uses only the <code>os<\/code> module from the Python standard library.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">import os\n\n\ndef get_files_paths(directory: str, file_type=None) -> list:\n    \"\"\"\n    Prepare a list of paths to the files within a given directory.\n    :param directory: (str) path to the directory.\n    :param file_type: (str) file name suffix; default=None, all files are selected.\n    :return: (list) paths built by joining the directory with each name.\n    \"\"\"\n    if file_type is not None:\n        # Keep only the names that end with the requested suffix.\n        files = [os.path.join(directory, x) for x in os.listdir(directory) if x.endswith(file_type)]\n        return files\n    else:\n        # No suffix given: return every entry that os.listdir() reports.\n        return [os.path.join(directory, x) for x in os.listdir(directory)]<\/pre>\n\n\n\n<p>The main logic of this function lives in the list comprehension. First, we list all entries in the directory with the <code>os.listdir()<\/code> function (this includes subdirectories as well as files). Then we join the directory path and each filename with the <code>os.path.join()<\/code> function. 
The condition in the list comprehension, <code>if x.endswith(file_type)<\/code>, checks whether each name in the directory ends with the value of the <code>file_type<\/code> parameter. The function also works with all kinds of files when we don&#8217;t want to limit it to a specific file type: in that case we leave <code>file_type<\/code> at its default value, <code>None<\/code>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>How to get a list of paths to the files with a specific extension in Python.<\/p>\n","protected":false},"author":1,"featured_media":603,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2,3,17],"tags":[136,133,134,135,132,131,7],"class_list":["post-602","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-data-engineering","category-python","category-scripts","tag-extension","tag-files","tag-get-files-from-directory","tag-get-specific-files","tag-list-of-files","tag-os","tag-python"],"_links":{"self":[{"href":"https:\/\/ml-gis-service.com\/index.php\/wp-json\/wp\/v2\/posts\/602","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ml-gis-service.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ml-gis-service.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ml-gis-service.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/ml-gis-service.com\/index.php\/wp-json\/wp\/v2\/comments?post=602"}],"version-history":[{"count":2,"href":"https:\/\/ml-gis-service.com\/index.php\/wp-json\/wp\/v2\/posts\/602\/revisions"}],"predecessor-version":[{"id":605,"href":"https:\/\/ml-gis-service.com\/index.php\/wp-json\/wp\/v2\/posts\/602\/revisions\/605"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/ml-gis-service.com\/index.php\/wp-json\/wp\/v2\/media\/603"}],"wp:attachment":[{"href":"https:\/\/ml-gis-se
rvice.com\/index.php\/wp-json\/wp\/v2\/media?parent=602"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ml-gis-service.com\/index.php\/wp-json\/wp\/v2\/categories?post=602"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ml-gis-service.com\/index.php\/wp-json\/wp\/v2\/tags?post=602"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}