Reference
This part of the project documentation focuses on an information-oriented approach. Use it as a reference for the technical implementation of the file-clerk project code.
A collection of functions for dealing with files and file content.
I created this library for previous projects that deal with files and file paths: getting code from files, listing the files in project folders, reading file extensions, and capturing only files of a particular type. I developed those projects on Windows, so these functions were designed to work with file paths on Windows, Mac, and Linux (Windows is the one with backslashes; wacky, I know).
Typical usage example:
extension = get_file_type("path/to/file.js")
code_string = file_to_string("path/to/file.html")
project_path = "path/to/project"
all_project_files = get_all_project_files(project_path)
just_css_files = get_all_files_of_type(project_path, "css")
clear_extra_text(my_text)
Removes line returns and extra spaces from my_text.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
my_text | str | Text which may include line returns or extra spaces. | required |
Returns:
Name | Type | Description |
---|---|---|
stripped_text | str | Text without any line returns or additional spaces. |
Source code in file_clerk/clerk.py
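As a rough illustration of the behavior described above, the following sketch collapses line returns and repeated spaces using only the standard library. The name `clear_extra_text_sketch` is hypothetical and this is not the code in file_clerk/clerk.py, which may treat edge cases differently.

```python
def clear_extra_text_sketch(my_text: str) -> str:
    """Collapse line returns and runs of spaces into single spaces.

    Hypothetical stand-in for illustration, not the library's own code.
    """
    # str.split() with no arguments splits on any run of whitespace
    # (spaces, tabs, \n, \r) and ignores leading/trailing whitespace.
    return " ".join(my_text.split())


messy = "First line\n   second   line  \r\n"
print(clear_extra_text_sketch(messy))  # First line second line
```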
delete_file(filepath)
Deletes the file at the given path, but only if it exists.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
filepath | str | The file location | required |
Source code in file_clerk/clerk.py
file_exists(file_path)
Returns True if the file at the given path exists, False otherwise.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
file_path | str | The file location | required |
Returns:
Name | Type | Description |
---|---|---|
bool | bool | True if the file exists, False if not. |
Source code in file_clerk/clerk.py
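The existence check and the guarded delete described above fit together naturally. The sketch below shows one way to get that behavior with `pathlib`; the `_sketch` function names are hypothetical, and the library's own implementation may differ (for example in how it treats directories).

```python
import tempfile
from pathlib import Path


def file_exists_sketch(file_path: str) -> bool:
    """Return True only if file_path points at an existing regular file."""
    return Path(file_path).is_file()


def delete_file_sketch(filepath: str) -> None:
    """Delete the file, doing nothing if it is not there."""
    target = Path(filepath)
    if target.is_file():
        target.unlink()


# Demonstrate on a throwaway file so nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo.txt"
    demo.write_text("hello", encoding="utf-8")
    print(file_exists_sketch(str(demo)))  # True
    delete_file_sketch(str(demo))
    delete_file_sketch(str(demo))  # second call is a no-op, no exception
    print(file_exists_sketch(str(demo)))  # False
```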
file_to_string(path)
Returns contents of file as a string.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | str | The path to a file using Posix format (forward slashes, e.g. path/to/file.ext) | required |
Returns:
Name | Type | Description |
---|---|---|
file_text | str | The contents of the file in utf-8 string format. |
Source code in file_clerk/clerk.py
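Reading a file into a utf-8 string, as described above, can be sketched with `pathlib`. The function name `file_to_string_sketch` is hypothetical; it only mirrors the documented input and output and is not taken from the library source.

```python
import tempfile
from pathlib import Path


def file_to_string_sketch(path: str) -> str:
    """Return the whole file as one string, decoded as utf-8."""
    # Path accepts forward-slash paths on Windows, Mac, and Linux.
    return Path(path).read_text(encoding="utf-8")


# Write a small HTML file to a temporary folder, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    html_file = Path(tmp) / "file.html"
    html_file.write_text("<h1>Hello</h1>", encoding="utf-8")
    code_string = file_to_string_sketch(html_file.as_posix())
    print(code_string)  # <h1>Hello</h1>
```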
get_all_files_of_type(dir_path, filetype)
Returns all files of a particular type from a directory.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dir_path | str | The path to a directory using Posix format (forward slashes, e.g. path/to/project) | required |
filetype | str | An extension in the form of a string, without the dot (e.g. html, css, js, etc.) | required |
Returns:
Name | Type | Description |
---|---|---|
files | list | A list of all files of the given type in the directory. |
Source code in file_clerk/clerk.py
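To make the idea of filtering a folder by extension concrete, here is a minimal sketch built on `Path.rglob`. Two things are assumptions rather than documented behavior: that subfolders are searched recursively, and that results come back as paths relative to `dir_path`. The function name is hypothetical as well.

```python
import tempfile
from pathlib import Path


def get_all_files_of_type_sketch(dir_path: str, filetype: str) -> list:
    """List files under dir_path whose extension matches filetype."""
    root = Path(dir_path)
    # Assumption: search recursively; filetype is given without the dot.
    matches = root.rglob(f"*.{filetype}")
    return sorted(p.relative_to(root).as_posix() for p in matches if p.is_file())


# Build a tiny fake project to search.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    (project / "css").mkdir()
    (project / "index.html").write_text("", encoding="utf-8")
    (project / "css" / "main.css").write_text("", encoding="utf-8")
    (project / "css" / "print.css").write_text("", encoding="utf-8")
    print(get_all_files_of_type_sketch(project.as_posix(), "css"))
    # ['css/main.css', 'css/print.css']
```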
get_all_project_files(dir_path)
Returns a list of all files from the directory in the path.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dir_path | str | The path to a directory using Posix format (forward slashes, e.g. path/to/project) | required |
Returns:
Name | Type | Description |
---|---|---|
files | list | A list of all html, css, and javascript files |
Source code in file_clerk/clerk.py
get_file_name(path)
Returns the name of the file in the path.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | str | The path to a file using Posix format (forward slashes, e.g. path/to/file.ext) | required |
Returns:
Name | Type | Description |
---|---|---|
filename | str | The name of the file (with extension) |
Source code in file_clerk/clerk.py
get_file_type(path)
Returns the extension of the file in the path.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | str | The path to a file using Posix format (forward slashes, e.g. path/to/file.ext) | required |
Returns:
Name | Type | Description |
---|---|---|
extension | str | The extension of the file type, without the dot (e.g. html, js, css, pdx, png) |
Source code in file_clerk/clerk.py
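Extracting a file name and a dot-less extension from a Posix-style path, as described for get_file_name and get_file_type, might look like the sketch below. It uses `PurePosixPath` from the standard library; the `_sketch` names are hypothetical, and the real functions may handle unusual names (no extension, several dots) differently.

```python
from pathlib import PurePosixPath


def get_file_name_sketch(path: str) -> str:
    """Return the last path component, extension included."""
    return PurePosixPath(path).name


def get_file_type_sketch(path: str) -> str:
    """Return the final extension without its leading dot."""
    # .suffix is ".js" for "file.js" and "" when there is no extension.
    return PurePosixPath(path).suffix.lstrip(".")


print(get_file_name_sketch("path/to/file.js"))  # file.js
print(get_file_type_sketch("path/to/file.js"))  # js
```

Note that in this sketch a name such as archive.tar.gz yields only the last extension, gz.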
get_full_path_string(path)
Returns the absolute path to the file given by a relative path.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | str | The file location using the Posix format (forward/slashes) | required |
Returns:
Name | Type | Description |
---|---|---|
full_path | Path Object | A WindowsPath (on Windows) or PosixPath (on Mac or Linux). |
Source code in file_clerk/clerk.py
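The platform-specific return type noted above is what `pathlib.Path` gives by default: it instantiates a WindowsPath or a PosixPath depending on the operating system. A minimal sketch, assuming the relative path is resolved against the current working directory (the function name is hypothetical):

```python
from pathlib import Path


def get_full_path_sketch(path: str) -> Path:
    """Turn a relative, forward-slash path into an absolute Path object."""
    # resolve() anchors the path at the current working directory and
    # does not require the file to exist.
    return Path(path).resolve()


full_path = get_full_path_sketch("path/to/file.html")
print(type(full_path).__name__)  # WindowsPath or PosixPath, depending on OS
print(full_path.is_absolute())   # True
```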
get_linked_css(contents_str)
Returns a list of linked CSS files.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
contents_str | str | HTML code from a file in string format. | required |
Returns:
Name | Type | Description |
---|---|---|
filenames | list | A list of all filenames extracted from CSS link tags. Note: no external stylesheets will be included (only local files). |
Source code in file_clerk/clerk.py
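One way to picture the "local stylesheets only" rule above is the following sketch, which walks the HTML with the standard library's `html.parser`. It is an assumption that "external" means an href starting with http://, https://, or //; the class and function names are hypothetical, and the library may extract the links by other means.

```python
from html.parser import HTMLParser


class _StylesheetCollector(HTMLParser):
    """Collect href values from <link rel="stylesheet"> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        rel = (attr_map.get("rel") or "").lower().split()
        href = attr_map.get("href") or ""
        if "stylesheet" in rel and href:
            self.hrefs.append(href)


def get_linked_css_sketch(contents_str: str) -> list:
    """Return local stylesheet hrefs, skipping externally hosted ones."""
    collector = _StylesheetCollector()
    collector.feed(contents_str)
    # Assumption: a URL scheme or protocol-relative prefix means "external".
    external = ("http://", "https://", "//")
    return [h for h in collector.hrefs if not h.lower().startswith(external)]


html = """
<link rel="stylesheet" href="css/main.css">
<link rel="stylesheet" href="https://cdn.example.com/reset.css">
<link rel="icon" href="favicon.ico">
"""
print(get_linked_css_sketch(html))  # ['css/main.css']
```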
get_path_list(path)
Returns a list of each path part using slash as separator.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | str | The file location using the Posix format (forward/slashes) | required |
Returns:
Name | Type | Description |
---|---|---|
path_list | list | A list of each folder in the path, with the file at the end. Example: path/to/file.ext will be ["path", "to", "file.ext"] |
Source code in file_clerk/clerk.py
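The example in the table above (path/to/file.ext becoming three parts) can be reproduced with a plain string split. This is only a sketch of the described result under a hypothetical name; dropping empty parts from leading or doubled slashes is an assumption.

```python
def get_path_list_sketch(path: str) -> list:
    """Split a forward-slash path into its folder and file parts."""
    # Empty strings appear for leading, trailing, or doubled slashes;
    # this sketch simply drops them.
    return [part for part in path.split("/") if part]


print(get_path_list_sketch("path/to/file.ext"))  # ['path', 'to', 'file.ext']
```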
remove_tags(element)
Removes all HTML tags from another tag's contents.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
element | str | The contents of a tag in string form, which may or may not have extra tags (in particular inline tags, such as :code: | required |
Returns:
Name | Type | Description |
---|---|---|
tagless_content | str | The contents of the tag minus any inner tags. |
Source code in file_clerk/clerk.py
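Stripping inner tags from a fragment of HTML, as described above, is often done with a simple regular expression. The sketch below takes that approach under a hypothetical name; it is not the library's code, and a regex like this is only suitable for small, well-formed fragments rather than arbitrary HTML.

```python
import re

# Matches anything that looks like an opening or closing tag.
_TAG_PATTERN = re.compile(r"<[^>]+>")


def remove_tags_sketch(element: str) -> str:
    """Return the text of element with every <...> tag removed."""
    return _TAG_PATTERN.sub("", element)


print(remove_tags_sketch("Use <code>print()</code> <em>often</em>."))
# Use print() often.
```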
split_into_sentences(contents)
Returns a list of each sentence from the text.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
contents | str | A string of text (typically from a tag) that most likely has punctuation. | required |
Returns:
Name | Type | Description |
---|---|---|
sentences | list | A list of the sentences from the text, each in string format. |
Source code in file_clerk/clerk.py
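Sentence splitting can be approximated by breaking the text wherever whitespace follows a period, question mark, or exclamation point. That rule is an assumption made for this sketch (the library's own rule is not shown here), and it will mis-split abbreviations such as "e.g. this". The function name is hypothetical.

```python
import re


def split_into_sentences_sketch(contents: str) -> list:
    """Split text after ., ?, or ! when followed by whitespace."""
    # The lookbehind keeps the punctuation attached to its sentence.
    parts = re.split(r"(?<=[.!?])\s+", contents.strip())
    return [part for part in parts if part]


text = "Hello there. How are you? Fine!"
print(split_into_sentences_sketch(text))
# ['Hello there.', 'How are you?', 'Fine!']
```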
write_csv_file(filepath, data_list)
Create a CSV file using a 2-D list.
This function writes data_list (the contents of the file) to a CSV file at filepath, which is relative to the directory you set (most likely your project directory).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
filepath | str | The full path to your file, relative to the project folder. Example: | required |
data_list | list | A 2D list that will be your CSV file contents. NOTE: the first row will be your headers. | required |
Source code in file_clerk/clerk.py
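Writing a 2-D list to disk with the first row as headers, as described above, maps directly onto the standard `csv` module. The sketch below is a hypothetical stand-in rather than the library's implementation; the file name and row values are made up for the demonstration.

```python
import csv
import tempfile
from pathlib import Path


def write_csv_file_sketch(filepath: str, data_list: list) -> None:
    """Write each inner list of data_list as one CSV row."""
    # newline="" lets the csv module control line endings on every OS.
    with open(filepath, "w", newline="", encoding="utf-8") as csv_file:
        csv.writer(csv_file).writerows(data_list)


rows = [
    ["name", "extension"],  # first row acts as the header
    ["index", "html"],
    ["main", "css"],
]

with tempfile.TemporaryDirectory() as tmp:
    report = Path(tmp) / "report.csv"  # hypothetical file name
    write_csv_file_sketch(str(report), rows)
    with open(report, newline="", encoding="utf-8") as csv_file:
        print(list(csv.reader(csv_file)) == rows)  # True
```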