Summary
Generates data names in a directory or database structure by walking the tree top-down or bottom-up. Each directory or workspace yields a three-item tuple: directory path, directory names, and file names.
Discussion
The Python os module includes an os.walk function that can be used to walk through a directory tree and find data. However, os.walk is file based and does not recognize database contents such as geodatabase feature classes, tables, or rasters. arcpy.da.Walk can be used to catalog this data as well.
Syntax
Walk (top, {topdown}, {onerror}, {followlinks}, {datatype}, {type})
Parameter | Explanation | Data Type |
top | The top-level workspace that will be used. | String |
topdown | If topdown is True or not specified, the tuple for a directory is generated before the tuple for any of its workspaces (workspaces are generated top-down). If topdown is False, the tuple for a workspace is generated after the tuple for all of its subworkspaces (workspaces are generated bottom-up). When topdown is True, the dirnames list can be modified in-place, and Walk will only recurse into the subworkspaces whose names remain in dirnames. This can be used to limit the search, impose a specific order of visiting, or even to inform Walk about directories the caller creates or renames before it resumes Walk again. Modifying dirnames when topdown is False is ineffective, because in bottom-up mode the workspaces in dirnames are generated before dirpath itself is generated. (The default value is True) | Boolean |
onerror | Errors are ignored by default. The onerror function will be called with an OSError instance. The function can be used to report the error and continue with the walk or raise an exception to abort. Note: The file name is available as the filename attribute of the exception object. (The default value is None) | Function |
followlinks | By default, Walk does not walk into connection files. Set followlinks to True to visit connection files. (The default value is False) | Boolean |
datatype | The data type to limit the results returned. Valid data types are the following:
Multiple data types are supported if entered as a list or tuple.
(The default value is None) | String |
type | Feature and raster data types can be further limited by type.
Valid feature types are the following:
Valid raster types are:
Multiple data types are supported if entered as a list or tuple.
(The default value is None) | String |
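The topdown and onerror behavior described in the table can be tried without a geodatabase by using the standard library's os.walk, which takes parameters of the same names. The sketch below is illustrative only: it assumes arcpy.da.Walk orders workspaces and reports errors as the table describes, and the names visit_order and report_error are hypothetical helpers, not part of arcpy.

```python
import os
import tempfile


def report_error(error):
    """onerror callback: report the failing path and let the walk continue."""
    print(f"Skipped {error.filename}: {error.strerror}")


def visit_order(top, topdown=True):
    """Return directories, relative to top, in the order the walk yields them."""
    order = []
    # os.walk is used here as a stand-in for arcpy.da.Walk (an assumption).
    for dirpath, dirnames, filenames in os.walk(
        top, topdown=topdown, onerror=report_error
    ):
        order.append(os.path.relpath(dirpath, top))
    return order


with tempfile.TemporaryDirectory() as top:
    # Build a small tree: top/a/b
    os.makedirs(os.path.join(top, "a", "b"))
    top_down = visit_order(top, topdown=True)    # parent before children
    bottom_up = visit_order(top, topdown=False)  # children before parent
```

Here top_down lists the top-level directory first and the deepest directory last, while bottom_up lists them in the opposite order. Walking a path that does not exist calls report_error once and yields nothing.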
Return Value
Data Type | Explanation |
Generator | Yields a three-item tuple that includes the workspace path, directory names, and file names.
Note: Names in the lists include only the base name; no path components are included. To get a full path (which begins with top) to a file or directory in dirpath, use os.path.join(dirpath, name). |
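As a minimal sketch of the note above, the snippet below joins base names onto dirpath. The tuple is written by hand in the shape the generator yields; the workspace path and the names in it are made-up values for illustration, and full_paths is a hypothetical helper.

```python
import os


def full_paths(dirpath, names):
    """Join each base name onto dirpath to get a full path."""
    return [os.path.join(dirpath, name) for name in names]


# Hand-written stand-in for one tuple from the generator (illustrative values).
dirpath, dirnames, filenames = "c:/data/project.gdb", ["transport"], ["parcels", "roads"]

paths = full_paths(dirpath, filenames)
```

Each entry in paths begins with dirpath and ends with the original base name, so it can be passed on to tools that expect a full path.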
Code sample
Use the Walk function to catalog polygon feature classes.
import arcpy
import os

workspace = "c:/data"
feature_classes = []

# Limit the walk to polygon feature classes
walk = arcpy.da.Walk(workspace, datatype="FeatureClass", type="Polygon")
for dirpath, dirnames, filenames in walk:
    for filename in filenames:
        # Names are base names only; join with dirpath for the full path
        feature_classes.append(os.path.join(dirpath, filename))
Use the Walk function to catalog raster data. Any rasters in a folder named back_up will be ignored.
import arcpy
import os

workspace = "c:/data"
rasters = []

walk = arcpy.da.Walk(workspace, topdown=True, datatype="RasterDataset")
for dirpath, dirnames, filenames in walk:
    # Disregard any folder named 'back_up' in creating list of rasters
    if "back_up" in dirnames:
        dirnames.remove("back_up")
    for filename in filenames:
        rasters.append(os.path.join(dirpath, filename))
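The two samples above share one pattern: loop over the generator, optionally prune dirnames in place, and join each name onto dirpath. A possible way to factor that out is sketched below. collect_paths is a hypothetical helper, not part of arcpy; it accepts any generator that yields (dirpath, dirnames, filenames) tuples, so it is exercised here with os.walk on a temporary folder rather than with arcpy.da.Walk.

```python
import os
import tempfile


def collect_paths(walker, skip_dirs=()):
    """Collect full paths from a generator of (dirpath, dirnames, filenames)."""
    paths = []
    for dirpath, dirnames, filenames in walker:
        # Slice assignment edits the list in place, so a top-down walk
        # does not descend into the skipped names.
        dirnames[:] = [name for name in dirnames if name not in skip_dirs]
        paths.extend(os.path.join(dirpath, name) for name in filenames)
    return paths


with tempfile.TemporaryDirectory() as top:
    # Two folders, each holding an empty placeholder file named dem.tif
    for folder in ("back_up", "current"):
        os.makedirs(os.path.join(top, folder))
        with open(os.path.join(top, folder, "dem.tif"), "w") as handle:
            handle.write("")

    found = collect_paths(os.walk(top), skip_dirs={"back_up"})
    relative = [os.path.relpath(path, top) for path in found]
```

Only the file under current is collected, because back_up is removed from dirnames before the walk descends. Assuming arcpy.da.Walk honors in-place edits to dirnames as the topdown description states, the same helper could be called as collect_paths(arcpy.da.Walk(workspace, topdown=True, datatype="RasterDataset"), skip_dirs={"back_up"}).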