The SCA Dependencies Finder is a Python-based tool designed to parse XML files, extract dependencies based on predefined filters, and categorize them. The extracted dependencies are saved in CSV format for further analysis.
- Extracts dependencies from XML-based files (e.g., .wsdl, .xsd, .xml)
- Filters dependencies based on specific XML elements and attributes
- Categorizes dependencies into different types (MDS, HTTP, FILE, LOCAL, and CUSTOM)
- Supports custom filtering options
- Outputs extracted data as CSV files
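The extraction and categorization steps listed above could look roughly like the following. This is a minimal sketch, not the tool's actual code: the attribute names, the prefix rules (including the `oramds:` prefix for MDS), and the function names `categorize` and `extract_dependencies` are all assumptions made for illustration. CUSTOM filters are left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical settings; the real values live in config.py.
ATTRIBUTES = ("location", "schemaLocation")

# Hypothetical prefix rules, checked in order; the first match wins.
CATEGORY_PREFIXES = {
    "MDS": ("oramds:",),
    "HTTP": ("http://", "https://"),
    "FILE": ("file:",),
}


def categorize(value: str) -> str:
    """Return the category whose prefix matches, falling back to LOCAL."""
    lowered = value.lower()
    for category, prefixes in CATEGORY_PREFIXES.items():
        if lowered.startswith(prefixes):
            return category
    return "LOCAL"


def extract_dependencies(xml_text: str) -> list:
    """Collect every watched attribute found anywhere in the document."""
    rows = []
    root = ET.fromstring(xml_text)
    for node in root.iter():
        # ElementTree stores namespaced tags as '{namespace}name'.
        tag = node.tag.rsplit("}", 1)[-1]
        for attribute in ATTRIBUTES:
            value = node.get(attribute)
            if value:
                rows.append({
                    "element": tag,
                    "attribute": attribute,
                    "path": value,
                    "category": categorize(value),
                })
    return rows
```

Treating anything without a recognized prefix as LOCAL is one possible design choice; the real filters may be stricter.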
- Python 3.9+
- Install required dependencies using:
pip install -r requirements.txt
To process a directory of XML files and extract dependencies, run:
python app.py
By default, the script uses the input and output directories configured in config.py.
You can specify custom input and output directories:
python app.py --input /path/to/xml/files --output /path/to/save/results
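For illustration, the two options could be wired to the configured defaults with `argparse` along these lines. This is a sketch under assumptions: the `input_dir` and `output_dir` keys and the `parse_args` helper are hypothetical, and app.py may handle its arguments differently.

```python
import argparse

# Hypothetical stand-in for the DEFAULT dict in config.py.
DEFAULT = {"input_dir": "input", "output_dir": "output"}


def parse_args(argv=None):
    """Parse --input and --output, falling back to the configured defaults."""
    parser = argparse.ArgumentParser(
        description="Extract dependencies from XML files."
    )
    parser.add_argument(
        "--input",
        default=DEFAULT["input_dir"],
        help="directory containing the XML files to scan",
    )
    parser.add_argument(
        "--output",
        default=DEFAULT["output_dir"],
        help="directory where the CSV results are written",
    )
    # argv=None makes argparse read sys.argv[1:], as on the command line.
    return parser.parse_args(argv)
```

Calling `parse_args([])` yields the defaults, which matches the behavior of running the script with no arguments.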
The config.py file contains settings for:
- File Extensions: XML-based file types to process
- Elements & Attributes: XML tags and attributes used for dependency extraction
- Filters: Predefined filters for categorization (MDS, HTTP, FILE, etc.)
- Defaults: Input/output paths, file format, and logging settings
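A config.py covering those four groups might be laid out as below. Only the `DEFAULT` dict and its `logging_format` key are named in this README; every other name and value here is a hypothetical placeholder, so treat this as a rough shape rather than the actual file.

```python
# Illustrative layout only; all names except DEFAULT and its
# "logging_format" key are hypothetical.

# File Extensions: XML-based file types to process.
FILE_EXTENSIONS = (".wsdl", ".xsd", ".xml")

# Elements & Attributes: where dependency references are looked up.
ELEMENTS = ("import", "include")
ATTRIBUTES = ("location", "schemaLocation")

# Filters: value prefixes that place a dependency in a category.
FILTERS = {
    "MDS": ("oramds:",),
    "HTTP": ("http://", "https://"),
    "FILE": ("file:",),
}

# Defaults: paths, output format, and logging settings.
DEFAULT = {
    "input_dir": "input",
    "output_dir": "output",
    "file_format": "csv",
    "logging_format": "%(asctime)s - %(levelname)s - %(message)s",
}
```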
The extracted dependencies are saved in CSV format with filenames structured as:
dependencies_<CATEGORY>_<TIMESTAMP>.csv
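A filename following this pattern could be built as in the sketch below. The timestamp layout (`%Y%m%d_%H%M%S`) and the helper name `output_filename` are assumptions; the tool's real timestamp format is not documented here.

```python
from datetime import datetime
from typing import Optional


def output_filename(category: str, now: Optional[datetime] = None) -> str:
    """Build dependencies_<CATEGORY>_<TIMESTAMP>.csv for one category."""
    # Assumed timestamp layout; passing `now` keeps the result reproducible.
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    return f"dependencies_{category}_{stamp}.csv"
```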
Each CSV file contains the following columns:
- file - The relative path of the source XML file
- element - The XML element containing the dependency
- attribute - The attribute in which the dependency is found
- path - The extracted dependency value
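Rows with these four columns can be written with the standard `csv` module. The sketch below assumes each dependency is held as a dict; the `write_rows` helper and the sample row are hypothetical and only show the column layout.

```python
import csv
import io

COLUMNS = ["file", "element", "attribute", "path"]


def write_rows(rows, stream):
    """Write a header line followed by one CSV line per dependency."""
    # extrasaction="ignore" drops any keys that are not output columns.
    writer = csv.DictWriter(stream, fieldnames=COLUMNS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)


# Demonstration with an in-memory buffer instead of a real file.
buffer = io.StringIO()
write_rows(
    [{
        "file": "composite.xml",
        "element": "import",
        "attribute": "location",
        "path": "oramds:/apps/a.wsdl",
    }],
    buffer,
)
print(buffer.getvalue())
```

When writing to disk, open the file with `newline=""` so the `csv` module controls line endings itself.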
The script logs its progress and errors to the console. The log level can be adjusted in config.py using:
logging.basicConfig(level=logging.INFO, format=DEFAULT.get("logging_format"))
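One way to make the level configurable without editing that call is to read it from the same dict. This is only a sketch: the `logging_level` key, the `LEVELS` table, and the `configure_logging` helper are hypothetical additions, not part of the documented config.py.

```python
import logging

# Hypothetical settings; only "logging_format" is named in this README.
DEFAULT = {
    "logging_format": "%(asctime)s - %(levelname)s - %(message)s",
    "logging_level": "DEBUG",
}

LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
}


def configure_logging(settings=DEFAULT):
    """Configure console logging and return the numeric level applied."""
    name = str(settings.get("logging_level", "INFO")).upper()
    # Unknown level names fall back to INFO.
    level = LEVELS.get(name, logging.INFO)
    # force=True replaces any handlers installed by an earlier call.
    logging.basicConfig(
        level=level, format=settings.get("logging_format"), force=True
    )
    return level
```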
This project is open-source and available under the MIT License.