reference
This part of the project documentation focuses on an information-oriented approach. Use it as a reference for the technical implementation of the file-clerk project code.
A collection of functions for dealing with files and file content.
I created this library for previous projects that deal with files and file paths: getting code from files, listing the files in project folders, reading file extensions, and capturing just the files of a particular type. I also developed my projects on Windows, so these functions were designed to work with file paths on Windows, Mac, and Linux (Windows is the one with backslashes; wacky, I know).
Typical usage example:
extension = get_file_type("path/to/file.js")
code_string = file_to_string("path/to/file.html")
project_path = "path/to/project"
all_project_files = get_all_project_files(project_path)
just_css_files = get_all_files_of_type(project_path, "css")
clear_extra_text(my_text)
Removes line returns and extra spaces from my_text.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
my_text | str | Text which may include line returns or extra spaces. | required |
Returns:
Name | Type | Description |
---|---|---|
stripped_text | str | Text without any line returns or additional spaces. |
Source code in file_clerk/clerk.py
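The behaviour described for clear_extra_text can be illustrated with a minimal sketch. The function below is a hypothetical stand-in (the name `clear_extra_text_sketch` is not part of file-clerk), and it assumes that each line return is replaced by a single space rather than dropped outright.

```python
import re


def clear_extra_text_sketch(my_text: str) -> str:
    """Hypothetical stand-in: collapse line returns and repeated spaces."""
    # Replace every run of whitespace (spaces, tabs, \r, \n) with one space,
    # then trim whatever is left at either end.
    return re.sub(r"\s+", " ", my_text).strip()


example = "Hello,\n   world!  \r\nHow   are you?"
cleaned = clear_extra_text_sketch(example)
print(cleaned)  # Hello, world! How are you?
```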
delete_file(filepath)
Deletes the file at the given path, but only if it exists.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
filepath | str | The file location | required |
Source code in file_clerk/clerk.py
file_exists(file_path)
Returns True if the file at the given path exists, False otherwise.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
file_path | str | The file location | required |
Returns:
Name | Type | Description |
---|---|---|
bool | bool | True if the file exists, False if not. |
Source code in file_clerk/clerk.py
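The existence check, together with the delete-only-if-present behaviour of delete_file, can be sketched with pathlib. Both functions below are hypothetical stand-ins, not the library's source, and the temporary directory is only there so the example leaves nothing behind.

```python
import tempfile
from pathlib import Path


def file_exists_sketch(file_path: str) -> bool:
    """Hypothetical stand-in: True only if the path points to an existing file."""
    return Path(file_path).is_file()


def delete_file_sketch(filepath: str) -> None:
    """Hypothetical stand-in: delete the file, doing nothing if it is absent."""
    Path(filepath).unlink(missing_ok=True)  # missing_ok needs Python 3.8+


with tempfile.TemporaryDirectory() as tmp_dir:
    target = Path(tmp_dir) / "notes.txt"
    target.write_text("temporary", encoding="utf-8")

    existed_before = file_exists_sketch(str(target))
    delete_file_sketch(str(target))
    exists_after = file_exists_sketch(str(target))
    delete_file_sketch(str(target))  # second call is a harmless no-op
```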
file_to_string(path)
Returns contents of file as a string.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | str | The path to a file using Posix format (forward slashes e.g. path/to/file.ext) | required |
Returns:
Name | Type | Description |
---|---|---|
file_text | str | The contents of the file in utf-8 string format. |
Source code in file_clerk/clerk.py
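Reading a file into a UTF-8 string, as described above, might look like the following sketch. `file_to_string_sketch` is a hypothetical stand-in; how the library itself opens the file is not shown in this reference.

```python
import tempfile
from pathlib import Path


def file_to_string_sketch(path: str) -> str:
    """Hypothetical stand-in: return the file's contents decoded as UTF-8."""
    return Path(path).read_text(encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp_dir:
    sample = Path(tmp_dir) / "index.html"
    sample.write_text("<h1>Café</h1>\n", encoding="utf-8")

    # as_posix() yields forward slashes on every operating system.
    file_text = file_to_string_sketch(sample.as_posix())
```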
get_all_files_of_type(dir_path, filetype)
Returns all files of a particular type from a directory.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dir_path | str | The path to a directory using Posix format (forward slashes e.g. path/to/project) | required |
filetype | str | An extension in the form of a string, without the dot (e.g. html, css, js, etc.) | required |
Returns:
Name | Type | Description |
---|---|---|
files | list | A list of all files in the directory that have the given extension. |
Source code in file_clerk/clerk.py
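Filtering a directory by extension can be sketched with pathlib's glob support. The function name is hypothetical, and two details are assumptions rather than documented behaviour: that sub-folders are searched as well, and that results come back as Posix-style strings.

```python
import tempfile
from pathlib import Path


def get_all_files_of_type_sketch(dir_path: str, filetype: str) -> list:
    """Hypothetical stand-in: Posix-style paths of files ending in .<filetype>."""
    # rglob also searches sub-folders; whether the library recurses is an assumption.
    matches = Path(dir_path).rglob(f"*.{filetype}")
    return sorted(p.as_posix() for p in matches if p.is_file())


with tempfile.TemporaryDirectory() as tmp_dir:
    project = Path(tmp_dir)
    (project / "css").mkdir()
    (project / "index.html").write_text("<p>hi</p>", encoding="utf-8")
    (project / "css" / "main.css").write_text("p {}", encoding="utf-8")
    (project / "css" / "print.css").write_text("p {}", encoding="utf-8")

    css_files = get_all_files_of_type_sketch(project.as_posix(), "css")
    css_names = [Path(f).name for f in css_files]
```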
get_all_project_files(dir_path)
Returns a list of all files from the directory in the path.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dir_path | str | The path to a directory using Posix format (forward slashes e.g. path/to/project) | required |
Returns:
Name | Type | Description |
---|---|---|
files | list | A list of all html, css, and javascript files |
Source code in file_clerk/clerk.py
get_file_name(path)
Returns the name of the file in the path.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | str | The path to a file using Posix format (forward slashes e.g. path/to/file.ext) | required |
Returns:
Name | Type | Description |
---|---|---|
filename | str | The name of the file (with extension) |
Source code in file_clerk/clerk.py
get_file_type(path)
Returns the extension of the file in the path.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | str | The path to a file using Posix format (forward slashes e.g. path/to/file.ext) | required |
Returns:
Name | Type | Description |
---|---|---|
extension | str | The extension of the file, without the dot (e.g. html, js, css, pdx, png) |
Source code in file_clerk/clerk.py
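Extracting the file name and the dot-free extension from a Posix-style path can be sketched with pathlib. Both functions are hypothetical stand-ins. How the library treats multi-part extensions such as .tar.gz is not documented here, so the last line only shows what this sketch does.

```python
from pathlib import PurePosixPath


def get_file_name_sketch(path: str) -> str:
    """Hypothetical stand-in: the final path component, extension included."""
    return PurePosixPath(path).name


def get_file_type_sketch(path: str) -> str:
    """Hypothetical stand-in: the extension without its leading dot."""
    # suffix includes the dot (".js"), so strip it off.
    return PurePosixPath(path).suffix.lstrip(".")


name = get_file_name_sketch("path/to/file.js")        # "file.js"
extension = get_file_type_sketch("path/to/file.js")   # "js"
archive_ext = get_file_type_sketch("backups/site.tar.gz")  # "gz" in this sketch
```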
get_full_path_string(path)
Returns the absolute path to the file given by a relative path.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | str | The file location using the Posix format (forward slashes) | required |
Returns:
Name | Type | Description |
---|---|---|
full_path | Path object | A WindowsPath (on Windows) or a PosixPath (on Mac or Linux) |
Source code in file_clerk/clerk.py
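Turning a relative Posix-style path into a platform-specific absolute Path object can be sketched as follows. The function name is hypothetical, and resolving against the current working directory is an assumption about what "relative" means here.

```python
from pathlib import Path


def get_full_path_sketch(path: str) -> Path:
    """Hypothetical stand-in: absolute Path built from a relative Posix path."""
    # Path() picks WindowsPath or PosixPath to match the operating system.
    return (Path.cwd() / path).resolve()


full_path = get_full_path_sketch("path/to/file.ext")
print(type(full_path).__name__)  # WindowsPath or PosixPath, depending on the OS
```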
get_linked_css(contents_str)
Returns a list of linked CSS files.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
contents_str | str | HTML code from a file in string format. | required |
Returns:
Name | Type | Description |
---|---|---|
filenames | list | A list of all filenames extracted from CSS link tags. Note: no external stylesheets will be included (only local files). |
Source code in file_clerk/clerk.py
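One way to collect local stylesheet links from an HTML string is with the standard library's HTML parser. Everything below is a hypothetical sketch: the class and function names are invented for illustration, an href counts as external if it starts with http://, https://, or //, and the href is returned exactly as written in the tag.

```python
from html.parser import HTMLParser


class _StylesheetLinkParser(HTMLParser):
    """Collects href values from <link rel="stylesheet"> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower()
        href = attributes.get("href")
        if tag == "link" and "stylesheet" in rel.split() and href:
            self.hrefs.append(href)


def get_linked_css_sketch(contents_str: str) -> list:
    """Hypothetical stand-in: hrefs of local stylesheets, in document order."""
    parser = _StylesheetLinkParser()
    parser.feed(contents_str)
    # Assumption: these prefixes mark a stylesheet as external.
    external_prefixes = ("http://", "https://", "//")
    return [h for h in parser.hrefs if not h.lower().startswith(external_prefixes)]


html = """
<head>
  <link rel="stylesheet" href="css/main.css">
  <link rel="stylesheet" href="https://cdn.example.com/reset.css">
  <link rel="icon" href="favicon.ico">
  <link href="print.css" rel="stylesheet" />
</head>
"""
filenames = get_linked_css_sketch(html)
```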
get_path_list(path)
Returns a list of each path part using slash as separator.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | str | The file location using the Posix format (forward slashes) | required |
Returns:
Name | Type | Description |
---|---|---|
path_list | list | A list of each folder in the path, with the file at the end. Example: path/to/file.ext will be ["path", "to", "file.ext"] |
Source code in file_clerk/clerk.py
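Splitting a path on forward slashes reproduces the example given in the table. The function below is a hypothetical stand-in. Dropping empty parts, so that a leading or trailing slash adds no "" entries, is an assumption.

```python
def get_path_list_sketch(path: str) -> list:
    """Hypothetical stand-in: split a Posix-style path into its parts."""
    # Skip empty strings produced by leading, trailing, or doubled slashes.
    return [part for part in path.split("/") if part]


path_list = get_path_list_sketch("path/to/file.ext")
print(path_list)  # ['path', 'to', 'file.ext']
```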
remove_tags(element)
Removes all HTML tags from another tag's contents.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
element | str | The contents of a tag, in string form, which may or may not contain extra tags (in particular, inline tags). | required |
Returns:
Name | Type | Description |
---|---|---|
tagless_content | str | The contents of the tag minus any inner tags. |
Source code in file_clerk/clerk.py
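Stripping inner tags while keeping their text can be approximated with a regular expression. This is a hypothetical sketch, not the library's approach. It treats anything between "<" and the next ">" as a tag, which is fine for simple inline markup but not a full HTML parser.

```python
import re


def remove_tags_sketch(element: str) -> str:
    """Hypothetical stand-in: drop every <...> tag, keep the text between them."""
    return re.sub(r"<[^>]+>", "", element)


tagless = remove_tags_sketch(
    "Use the <code>print()</code> function <em>carefully</em>."
)
print(tagless)  # Use the print() function carefully.
```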
split_into_sentences(contents)
Returns a list of each sentence from the text.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
contents | str | A string of text (typically from a tag) that most likely has punctuation. | required |
Returns:
Name | Type | Description |
---|---|---|
sentences | list | A list of the sentences in the text, each as a string. |
Source code in file_clerk/clerk.py
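A simple punctuation-based split shows the idea. The function below is a hypothetical stand-in that breaks after ".", "!" or "?" when whitespace follows. It will mis-split abbreviations such as "e.g. this", which the library may well handle differently.

```python
import re


def split_into_sentences_sketch(contents: str) -> list:
    """Hypothetical stand-in: split text after ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", contents.strip())
    return [part for part in parts if part]


sentences = split_into_sentences_sketch(
    "Files are fun. Are paths fun too? Absolutely!"
)
print(sentences)  # ['Files are fun.', 'Are paths fun too?', 'Absolutely!']
```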
write_csv_file(filepath, data_list)
Create a CSV file using a 2-D list.
This function creates a CSV file from data_list (the contents of the file) at filepath, which is relative to the directory you set (most likely your project directory).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
filepath | str | The full path to your file, relative to the project folder. | required |
data_list | list | A 2D list that will be your CSV file contents. NOTE: the first row will be your headers. | required |
Source code in file_clerk/clerk.py
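Writing a 2D list to a CSV file, with the first row as headers, can be sketched with the standard csv module. `write_csv_file_sketch` is a hypothetical stand-in, and the file name and sample rows are invented for illustration.

```python
import csv
import tempfile
from pathlib import Path


def write_csv_file_sketch(filepath: str, data_list: list) -> None:
    """Hypothetical stand-in: write each inner list as one CSV row."""
    # newline="" lets the csv module control line endings itself.
    with open(filepath, "w", newline="", encoding="utf-8") as csv_file:
        csv.writer(csv_file).writerows(data_list)


data = [
    ["filename", "extension"],  # first row is the header row
    ["index.html", "html"],
    ["main.css", "css"],
]

with tempfile.TemporaryDirectory() as tmp_dir:
    target = Path(tmp_dir) / "report.csv"
    write_csv_file_sketch(str(target), data)

    # Read the file back to confirm the rows round-trip unchanged.
    with open(target, newline="", encoding="utf-8") as csv_file:
        rows_read_back = list(csv.reader(csv_file))
```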