BIDS (Binary Identification of Dependencies with Search). The BIDS project will deliver tooling to analyse ELF binaries and extract
key features for indexing and searching. The tooling to index these binary features in a search engine uses an inverted index.
NOTE that BIDS is not designed to detect the presence of malware; it is intended to help understand the scope of a binary and to support vulnerability analysis activities.
Alternatively, just clone the repo and install dependencies using the following command:
pip install -U -r requirements.txt
The tool requires Python 3 (3.9+). Using a virtual Python environment is recommended, especially
if you work with several versions of Python. virtualenv is a tool for setting up virtual Python environments; it
lets you keep all the dependencies for the tool in a single environment, or maintain separate environments
for testing against different versions of Python.
The installation process installs 5 separate tools:
bids-analyser which analyses an ELF binary and extracts dependency and symbolic information into a JSON file.
bids-scan which analyses a set of ELF binaries in a directory.
bids-search which provides a CLI to search through a set of binaries to extract dependency and symbolic information.
sbom4bids which generates a Software Bill of Materials (SBOM) from a bids JSON file.
bids-ui which provides a user interface to the tools to analyse binary files, generate SBOMs and query the extracted information.
The tools can also be used as Python libraries.
It is recommended that additional utilities are also installed to support the analysis activity.
bids-analyser analyses a binary application in ELF format and extracts dependency, symbolic and call graph information into a JSON data stream
options:
  -h, --help                                 show this help message and exit
  -V, --version                              show program's version number and exit
Input:
  -f FILE, --file FILE                       identity of binary file
  --description DESCRIPTION                  description of file
  --library-path LIBRARY_PATH                path to search for library files
  --exclude-dependency                       suppress reporting of dependencies
  --exclude-symbol                           suppress reporting of symbols
  --exclude-callgraph                        suppress reporting of call graph
  --detect-version                           detect version of component
Output:
  -d, --debug                                add debug information
  -o OUTPUT_FILE, --output-file OUTPUT_FILE  output filename (default: output to stdout)
Operation
The --file option is used to specify the binary file to be processed.
The --description option is used to provide a brief description of the binary being processed.
The --library-path option is used to provide the path for library files which are not located with the system library files. Multiple
locations can be specified as a list of paths separated by ',', e.g. "/local/lib,/project/lib".
The --exclude-dependency, --exclude-symbol and --exclude-callgraph options are used to disable
the capture of dependency, symbol or call graph information respectively.
The --detect-version option is used to indicate that an attempt will be made to detect the version of each component. This is disabled by default.
NOTE that detecting component versions requires the component binary to be executed. The risk can be reduced, to some extent, by the use of a sandbox;
bids-analyser will use the firejail sandbox if available, although this can be overridden by setting the environment variable BIDS_SANDBOX to the pathname of the sandbox to be used.
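The sandbox selection described above can be pictured with a short sketch. This is not the bids-analyser implementation; the function name resolve_sandbox is hypothetical, and the only details taken from the text are the BIDS_SANDBOX variable and the firejail default.

```python
import os
import shutil
from typing import Mapping, Optional


def resolve_sandbox(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the sandbox executable to use, or None if none is found.

    Hypothetical helper: it mirrors the documented behaviour, not the
    actual code inside bids-analyser.
    """
    env = os.environ if env is None else env
    # An explicit BIDS_SANDBOX setting takes priority over the default.
    override = env.get("BIDS_SANDBOX")
    if override:
        return override
    # Otherwise fall back to firejail if it can be found on the PATH.
    return shutil.which("firejail")
```

Passing a dictionary instead of the real environment makes the precedence easy to check without changing any settings on the machine.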
The --output-file option is used to control the destination of the output generated by the tool. By
default the output is written to the console; specify --output-file to store it in a file instead.
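When bids-analyser is driven from a script, the options above can be assembled into an argument list. The sketch below only builds the list and does not run the tool; the helper name build_analyser_command and the example file names are hypothetical, while the flags are the ones listed in the help text.

```python
from typing import List, Optional, Sequence


def build_analyser_command(
    binary: str,
    library_paths: Sequence[str] = (),
    output_file: Optional[str] = None,
    detect_version: bool = False,
) -> List[str]:
    """Build a bids-analyser argument list from the documented options."""
    command = ["bids-analyser", "--file", binary]
    if library_paths:
        # Multiple library locations are passed as one comma-separated value.
        command += ["--library-path", ",".join(library_paths)]
    if detect_version:
        # Disabled by default because it executes the component binary.
        command.append("--detect-version")
    if output_file:
        command += ["--output-file", output_file]
    return command


# Hypothetical file names, shown only to illustrate the resulting list.
print(build_analyser_command("hello", ["/local/lib", "/project/lib"], "hello.json"))
```

The resulting list can be handed to subprocess.run as it stands, which avoids any shell quoting of the paths.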
Output File Format
The output file is in JSON format. The content depends on the binary being analysed and the specified command line options.
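Because the content varies with the options used, it can help to inspect a generated file before relying on any particular field. The sketch below makes no assumption about the field names; it only assumes that the top level of the file is a JSON object, and the function name summarise_json is hypothetical.

```python
import json
from pathlib import Path
from typing import Dict


def summarise_json(path: str) -> Dict[str, str]:
    """Map each top-level key of a JSON object file to the type of its value."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    # Assumption: the analysis file holds a JSON object at the top level.
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    return {key: type(value).__name__ for key, value in data.items()}
```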
bids-scan analyses ELF binaries in a directory and extracts dependency and symbolic information
options:
  -h, --help                  show this help message and exit
  -V, --version               show program's version number and exit
Input:
  --directory DIRECTORY       directory to scan
  --pattern PATTERN           files pattern (default is all files)
Output:
  -d, --debug                 add debug information
  -o OUTPUT, --output OUTPUT  directory to store results
Operation
bids-scan analyses a set of ELF binaries in a directory. It is equivalent to calling bids-analyser for each file within the directory.
The --directory option is used to specify the directory containing the binary files to be processed.
The --pattern option is used to provide a file matching pattern to limit the number of files to be processed. The default is for all ELF files to be processed.
The --output option is used to store the output generated by the tool. The generated files will be named based on the filename of the ELF file.
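The selection step of a scan can be sketched as follows. This is an illustration rather than the bids-scan code: it assumes a non-recursive scan, shell-style patterns, and a hypothetical output naming scheme of the binary's filename plus ".json". The one fixed fact it relies on is that every ELF file starts with the four bytes 0x7F 'E' 'L' 'F'.

```python
import fnmatch
from pathlib import Path
from typing import List

ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF file


def is_elf(path: Path) -> bool:
    """Return True if the file starts with the ELF magic number."""
    try:
        with path.open("rb") as handle:
            return handle.read(4) == ELF_MAGIC
    except OSError:
        # Unreadable files are simply skipped.
        return False


def find_elf_files(directory: str, pattern: str = "*") -> List[Path]:
    """List ELF files directly inside a directory whose names match a pattern."""
    return sorted(
        entry
        for entry in Path(directory).iterdir()
        if entry.is_file() and fnmatch.fnmatch(entry.name, pattern) and is_elf(entry)
    )


def output_name(binary: Path, output_dir: str) -> Path:
    """Hypothetical naming scheme: <output_dir>/<binary filename>.json."""
    return Path(output_dir) / (binary.name + ".json")
```

Each path returned by find_elf_files would then be analysed in turn, with the result stored under the name given by output_name.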
bids-search is used to search through a set of analysed ELF binaries to extract dependency and symbolic information.
The --initialise option is used to initialise the dataset. All data within an existing dataset is removed.
The --index option is used to add data to the dataset. All valid JSON bids-analysis files within the directory are added to the dataset.
The --search option is used to query the dataset. The search query can include boolean logic, e.g. "libc AND libpng". The --verbose option is used to
return more detailed information related to the query. The --results option is used to limit the number of search results returned (the default is 10 results).
The --import and --export options are used to import a previously pre-populated dataset or export the current dataset to a file. These options are intended to be used to transfer a dataset between platforms.
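To show how an inverted index answers a boolean query such as the one given for --search, here is a toy sketch. It is not the search engine used by bids-search: the function names are hypothetical, terms are matched exactly, only the AND operator is handled, and the sample file and library names are invented for the illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def build_index(documents: Dict[str, Iterable[str]]) -> Dict[str, Set[str]]:
    """Map each term to the set of documents that contain it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for name, terms in documents.items():
        for term in terms:
            index[term.lower()].add(name)
    return dict(index)


def search_and(index: Dict[str, Set[str]], query: str) -> Set[str]:
    """Return the documents containing every term of a 'a AND b' query."""
    terms = [part.strip().lower() for part in query.split(" AND ")]
    terms = [term for term in terms if term]
    if not terms:
        return set()
    # Start from the first term's documents and intersect with the rest.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Invented sample data: two analysed binaries and the libraries they use.
index = build_index({
    "hello": ["libc.so.6"],
    "viewer": ["libc.so.6", "libpng16.so.16"],
})
print(search_and(index, "libc.so.6 AND libpng16.so.16"))
```

Looking up a term is a single dictionary access, so the cost of a query depends on the number of terms rather than the number of indexed binaries.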
Example operation
The following shows a typical user session.
The dataset is first initialised before adding a set of ELF analysis files in the directory samples/gdata.
To get details of the results, add the --verbose option. The number of results returned can also be specified using the --results option; to only
return the top result, set the number of results to 1.
Read/write access is required to a directory (~/.cache/bids/) for storing and reading the dataset. This location can be overridden by setting the environment variable BIDS_DATASET.
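A script that wraps bids-search may want to check the dataset location before running it. The sketch below follows the rule described above; the helper names are hypothetical, and it is an assumption that BIDS_DATASET holds a directory path in the same way as the default location.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_dataset_dir(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the dataset directory: BIDS_DATASET if set, else ~/.cache/bids."""
    env = os.environ if env is None else env
    override = env.get("BIDS_DATASET")
    if override:
        # Assumption: the variable names a directory, like the default does.
        return Path(override).expanduser()
    return Path.home() / ".cache" / "bids"


def is_usable(directory: Path) -> bool:
    """Check that the directory exists and allows both reading and writing."""
    return directory.is_dir() and os.access(directory, os.R_OK | os.W_OK)
```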
Generates a Software Bill of Materials (SBOM) from a Bids JSON file
options:
  -h, --help                                 show this help message and exit
  -V, --version                              show program's version number and exit
Input:
  -i INPUT, --input INPUT                    name of Bids file
Output:
  -d, --debug                                add debug information
  --sbom {spdx,cyclonedx}                    specify type of sbom to generate (default: spdx)
  --format {tag,json,yaml}                   specify format of software bill of materials (sbom) (default: tag)
  -o OUTPUT_FILE, --output-file OUTPUT_FILE  output filename (default: output to stdout)
Operation
sbom4bids is used to generate a Software Bill of Materials (SBOM) for an analysed ELF binary.
The --input option is used to specify the name of the file containing the analysed data from the ELF binary, produced by bids-analyser.
The --sbom option is used to specify the type of the generated SBOM (the default is SPDX). The --format option
can be used to specify the formatting of the SBOM (the default is Tag Value format for an SPDX SBOM). JSON format is supported for both
SPDX and CycloneDX SBOMs.
The --output-file option is used to control the destination of the output generated by the tool. By
default the output is written to the console; specify --output-file to store it in a file instead.
Return Values
The following values are returned:
0 - BOM generation process completed
1 - Error detected in generation process
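For scripted use, the options and return values above can be wrapped in two small helpers. This is a sketch with hypothetical function names; the accepted values come from the help text, and it deliberately does not check which formats are valid for which SBOM type, since the text only states that JSON works for both.

```python
from typing import List, Optional

SBOM_TYPES = ("spdx", "cyclonedx")
SBOM_FORMATS = ("tag", "json", "yaml")


def build_sbom_command(
    input_file: str,
    sbom: str = "spdx",
    fmt: str = "tag",
    output_file: Optional[str] = None,
) -> List[str]:
    """Build an sbom4bids argument list from the documented options."""
    if sbom not in SBOM_TYPES:
        raise ValueError(f"unknown SBOM type: {sbom}")
    if fmt not in SBOM_FORMATS:
        raise ValueError(f"unknown SBOM format: {fmt}")
    command = ["sbom4bids", "--input", input_file, "--sbom", sbom, "--format", fmt]
    if output_file:
        command += ["--output-file", output_file]
    return command


def describe_return_code(code: int) -> str:
    """Translate the documented return values into a message."""
    if code == 0:
        return "BOM generation process completed"
    if code == 1:
        return "Error detected in generation process"
    return f"Unexpected return code {code}"
```

A caller would pass the list to subprocess.run and feed the returncode attribute of the result to describe_return_code.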
Bids-UI
bids-ui provides a text-based user interface (TUI) with facilities to:
Analyse a binary
Generate a Software Bill of Materials (SBOM) for an analysed binary file
Search a set of analysed binaries for symbolic and dependency information.
Note the user interface provides equivalent facilities to the CLI tools although not all options are available through the user interface.
Operation
The user interface is launched as follows:
bids-ui
Key bindings:
d is used to toggle between dark-mode (default) and light-mode
ctrl+p can be used to change the colour palette
q is used to terminate the application
Additional key bindings are context dependent.
A directory tree is shown on the left, with various buttons on the right. The Analyse Binary File and Generate SBOM buttons are only available once a file has been selected in the directory tree.
Analyse Binary File
To save an analysis to a file, enter the name of the file in the output filename box. If no filename is specified, the generated analysis will be shown on the screen.
To return to the previous screen, select the Close button.
SBOM Generation
Changing the type of SBOM to CycloneDX will result in the available formats changing.
To save an SBOM to a file, enter the name of the file in the filename box. If no filename is specified, the generated SBOM will be shown on the screen.
To return to the previous screen, select the Close button.
Query Database
The search term must be specified. The default number of results (10) can be overridden by specifying the required number of results. Selecting
the verbose reporting checkbox will result in more data being returned.
Additional key bindings are available to support searching within the results:
/ is used to enter a search term.
escape is used to cancel a search
n is used to move to the next search item
p is used to move to the previous search item
An example of the results of a verbose report.
API
BIDS can be used as a library to analyse ELF binary files, store data in a dataset and search for features within the dataset.
To analyse a binary file and store the results in a file, the following sequence of calls can be performed:
>>> from bids.analyser import BIDSAnalyser
>>> analyser = BIDSAnalyser()
>>> # Analyse a binary
>>> analyser.analyse("test/test_assets/hello")
>>> # Get details about the file
>>> analyser.get_file_data()
{'location': '/root/Documents/git_repo/BIDS/test/test_assets/hello', 'checksum': {'size': 15952, 'date': 'Tue Nov 19 11:13:36 2024', 'sha256': '4434a9af25d451e4a9f4515d418a270b99f2362326245a06523c921e564cde21', 'sha384': '8088e53ee015ef9af1656e0426c1678cdb69bfd4abfb2e5593dfee0e7d6b22a13cd19f47ac124e5d90c721e4680383b9', 'sha512': 'd11bc10ed1ed367753eb5050fa4d78a543c5f4a2c9c6ab7fcce2d5f0804a4722de91689b51cf91b11a698b7ee26ccab703ab143c91afca9427fde9550869e089', 'sha3-256': 'd4b4dc35397beeff05d247bd4c653b9b65162c747656321b827e7cc1d7f6a625', 'sha3-384': 'dec32ea35cc5b5d805d3d245463251e126d23b226085012f053b9c2207d7a1d74c925204d46518848a76396b515aa0cc', 'sha3-512': '8ca2c3817db6846b808a5367b08b8f9ed963bd308aaf0127a6fc4a23784c01f82948d6222689b761df65991807b5ad904538f756666239d94a8f70f15896bf75'}}
>>> # Identify the dependencies
>>> analyser.get_dependencies()
['libc.so.6']
>>> # Examine the global symbols
>>> analyser.get_global_symbols()
[['libc.so.6', 'GLIBC_2.34', '__libc_start_main'], ['libc.so.6', 'GLIBC_2.2.5', 'printf'], ['libc.so.6', 'GLIBC_2.2.5', '__cxa_finalize']]
>>> # ..and local symbols
>>> analyser.get_local_symbols()
['_ITM_deregisterTMCloneTable', '__gmon_start__', '_ITM_registerTMCloneTable']
>>> # Store data in a file
>>> from bids.output import BIDSOutput
>>> output = BIDSOutput()
>>> output.create_metadata(analyser.get_file_data())
>>> output.create_components(
...     analyser.get_dependencies(),
...     analyser.get_global_symbols(),
...     analyser.get_callgraph(),
...     local=analyser.get_local_symbols(),
... )
>>> output.generate_output(<<Filename>>)
To add data to a dataset, put all the files to be added to the index in a directory and call index_files with the name of the directory.
>>> from bids.index import BIDSIndexer
>>> dataset = BIDSIndexer()
>>> # Assume analysed files are in a directory
>>> dataset.index_files(<<Directory>>)
This project is sponsored by NLNET https://nlnet.nl/project/BIDS/.