Toolbox: Get all files of a specific type from a given directory in Python

Here is a common problem: we have only the path to a directory, and we want a list of paths to the files in it that have a specific extension. The task is so frequent in data processing that it is worth keeping this function at hand. The function is simple and uses only the os module from the Python standard library.

import os

def get_files_paths(directory: str, file_type=None) -> list:
    """
    Function prepares list of paths to the files within a given directory.

    :param directory: (str) path to the directory with files.
    :param file_type: (str) default=None, all files are selected.
    :return: (list)
    """
    if file_type is not None:
        # Keep only the names that end with the requested suffix.
        files = [os.path.join(directory, x) for x in os.listdir(directory) if x.endswith(file_type)]
        return files
    # No file type given: return a path for every entry in the directory.
    return [os.path.join(directory, x) for x in os.listdir(directory)]

The main logic of this function lives in the list comprehension. First, os.listdir() lists the names of all entries in the directory (subdirectories included, not only files). Then os.path.join() joins the directory path with each name. The condition if x.endswith(file_type) keeps only the names whose suffix matches the file_type parameter. The function also works when we do not want to limit the result to a specific file type: in that case we leave file_type at its default value of None, and every entry in the directory is returned.
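To see the behavior described above in action, here is a minimal usage sketch. It repeats a compact version of the function so the snippet runs on its own, and it creates a temporary directory with made-up file names (a.csv, b.csv, notes.txt), which are assumptions for illustration only. The results are sorted because os.listdir() returns names in arbitrary order.

```python
import os
import tempfile


def get_files_paths(directory: str, file_type=None) -> list:
    """Compact restatement of the function above, for a runnable demo."""
    names = os.listdir(directory)
    if file_type is not None:
        # Keep only the names that end with the requested suffix.
        names = [x for x in names if x.endswith(file_type)]
    return [os.path.join(directory, x) for x in names]


# Hypothetical sample files, created in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.csv", "b.csv", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()

    csv_paths = sorted(get_files_paths(tmp, ".csv"))
    all_paths = sorted(get_files_paths(tmp))

print([os.path.basename(p) for p in csv_paths])  # ['a.csv', 'b.csv']
print(len(all_paths))  # 3
```

Two details are worth keeping in mind. Passing ".csv" with the leading dot avoids matching a file whose name merely ends in "csv". And because str.endswith() also accepts a tuple of suffixes, a call such as get_files_paths(directory, (".csv", ".txt")) should select several file types at once without any change to the function.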
