API reference¶
Submodules¶
ibc_api.metadata¶
Functions to fetch metadata about the available IBC datasets.
- ibc_api.metadata.fetch_dataset_db(data_type, metadata=None)¶
Fetch the csv file containing file-by-file information about the requested dataset (usage sketch below).
Parameters¶
- data_type : str
which dataset to select; one of 'volume_maps', 'surface_maps', 'preprocessed', 'raw'
- metadata : dict, optional
dictionary object containing version info, dataset ids etc., by default None
Returns¶
- str
full path of the fetched csv file
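A minimal usage sketch, assuming the package is installed and importable as ibc_api and that the IBC docs repo is reachable over the network:

```python
from ibc_api import metadata

# Fetch the file-by-file csv for the volume maps dataset; the
# return value is the full local path of the downloaded csv.
db_path = metadata.fetch_dataset_db("volume_maps")
print(db_path)
```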
- ibc_api.metadata.fetch_metadata(file='datasets.json')¶
Fetch the metadata file from the IBC docs repo (usage sketch below).
Parameters¶
- file : str, optional
name of the file, by default "datasets.json"
Returns¶
- dict
json file loaded as a dictionary
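A short sketch of the default call; the keys printed here depend on the contents of datasets.json and are not guaranteed:

```python
from ibc_api import metadata

# Load datasets.json from the IBC docs repo as a dictionary.
datasets = metadata.fetch_metadata()  # same as file="datasets.json"
print(sorted(datasets.keys()))
```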
- ibc_api.metadata.fetch_remote_file(file, remote_root='https://raw.githubusercontent.com/individual-brain-charting/api/main/src/ibc_api/data/', local_root='/home/runner/work/docs/docs/api/src/ibc_api/data')¶
Fetch a file from the IBC docs repo (usage sketch below).
Parameters¶
- file : str
name of the file to fetch
- remote_root : str, optional
root link to wherever the file is stored, by default REMOTE_ROOT
- local_root : str, optional
location to write the fetched file, by default LOCAL_ROOT
Returns¶
- str
full path of the fetched file
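A sketch using the default remote and local roots; the file name here is only an example:

```python
from ibc_api import metadata

# Fetch one file from the IBC docs repo and get its local path back.
local_path = metadata.fetch_remote_file("datasets.json")
print(local_path)
```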
- ibc_api.metadata.select_dataset(data_type, metadata=None, version=None)¶
Select the metadata of the requested dataset (usage sketch below).
Parameters¶
- data_type : str
which dataset to select; one of 'volume_maps', 'surface_maps', 'preprocessed', 'raw'
- metadata : dict, optional
dictionary object containing version info, dataset ids etc., by default None
- version : int, optional
version of the dataset to select, starting from 1; by default None
Returns¶
- dict
the metadata of the requested dataset; the latest version if version is None
Raises¶
- KeyError
if the requested dataset is not found in the metadata
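A sketch of selecting the latest and a pinned version; the error branch follows the Raises note above:

```python
from ibc_api import metadata

# Latest version of the requested dataset (version=None).
latest = metadata.select_dataset("volume_maps")

# Pin a specific version; versions start from 1.
v1 = metadata.select_dataset("volume_maps", version=1)

# A dataset name missing from the metadata raises KeyError.
try:
    metadata.select_dataset("not_a_dataset")
except KeyError as err:
    print(err)
```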
ibc_api.utils¶
API to fetch IBC data from EBRAINS, via the Human Data Gateway, using siibra.
- ibc_api.utils.download_data(db, n_jobs=2, save_to=None)¶
Download the files listed in a (filtered) dataframe (usage sketch below).
Parameters¶
- db : pandas.DataFrame
dataframe with information about files in the dataset, ideally a subset of the full dataset
- n_jobs : int, optional
number of parallel jobs to run, by default 2; -1 uses all CPUs. See: https://joblib.readthedocs.io/en/latest/generated/joblib.Parallel.html
- save_to : str, optional
where to save the data, by default None, in which case the data is saved in a directory called "ibc_data" in the current working directory
Returns¶
- pandas.DataFrame
dataframe with downloaded file names and times from the dataset
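An end-to-end sketch combining get_info and filter_data (both documented below) with download_data; the subject ids are examples:

```python
from ibc_api import utils

# Fetch the dataset index, keep two subjects, then download.
db = utils.get_info(data_type="volume_maps")
subset = utils.filter_data(db, subject_list=["01", "02"])

# Returns a dataframe recording downloaded file names and times;
# with save_to=None the files land under ./ibc_data.
downloaded = utils.download_data(subset, n_jobs=2)
print(downloaded.head())
```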
- ibc_api.utils.download_gm_mask(resolution=1.5, save_to=None)¶
Download the grey matter mask (usage sketch below).
Parameters¶
- resolution : float, optional
resolution of the mask in mm, by default 1.5
- save_to : str, optional
where to save the mask, by default None
Returns¶
- save_as : str
path to the downloaded mask
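A one-line sketch at the default resolution:

```python
from ibc_api import utils

# Download the grey matter mask; returns the path it was saved to.
mask_path = utils.download_gm_mask(resolution=1.5)
print(mask_path)
```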
- ibc_api.utils.filter_data(db, subject_list=['01', '02', '03', '04', '05', '06', '07', '08', '09', '10', '11', '12', '13', '14', '15'], task_list=False)¶
Filter the dataframe to only include certain subjects and tasks (usage sketch below).
Parameters¶
- db : pandas.DataFrame
dataframe with information about all files in the dataset
- subject_list : list, optional
list of subjects to keep, by default SUBJECTS, which contains all subjects from ["01", "02", "04",…,"15"]
- task_list : list or bool, optional
list of tasks to keep, by default False
Returns¶
- pandas.DataFrame
dataframe with information about only those files corresponding to the given subjects and tasks
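A filtering sketch; the task name here is illustrative and must match a task label present in the dataframe:

```python
from ibc_api import utils

db = utils.get_info(data_type="volume_maps")

# Keep two subjects and one task; task_list=False (the default)
# would skip task filtering entirely.
subset = utils.filter_data(
    db, subject_list=["01", "02"], task_list=["ArchiStandard"]
)
print(len(subset))
```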
- ibc_api.utils.get_file_paths(db, metadata={'preprocessed': [{'db_file': 'preprocessed_v1.csv', 'id': '3ca4f5a1-647b-4829-8107-588a699763c1', 'root': 'PreprocessedData_v1.0', 'version': 1}], 'raw': [{'db_file': '', 'id': '0e5b9c99-4cb9-4b93-960f-e2a7fe6a16dd', 'root': '', 'version': 1}, {'db_file': 'raw_v2.csv', 'id': 'a1c940cc-4777-417e-9326-dd6584d6c71f', 'root': 'v2.0', 'version': 2}, {'db_file': 'raw_v3.csv', 'id': '8ddf749f-fb1d-4d16-acc3-fbde91b90e24', 'root': 'v3.0', 'version': 3}], 'surface_maps': [{'db_file': 'surface_maps_v1.csv', 'id': 'ad04f919-7dcc-48d9-864a-d7b62af3d49d', 'root': 'resulting_smooth_maps_surface', 'version': 1}], 'volume_maps': [{'db_file': 'volume_maps_v1.csv', 'id': '07ab1665-73b0-40c5-800e-557bc319109d', 'root': 'resulting_smooth_maps', 'version': 1}, {'db_file': 'volume_maps_v2.csv', 'id': 'ad04f919-7dcc-48d9-864a-d7b62af3d49d', 'root': 'resulting_smooth_maps', 'version': 2}]}, save_to_dir=None)¶
Get the remote and local file paths for each file in a (filtered) dataframe (usage sketch below).
Parameters¶
- db : pandas.DataFrame
dataframe with information about files in the dataset, ideally a subset of the full dataset
- metadata : dict, optional
dictionary object containing version info, dataset ids etc., by default the packaged metadata shown in the signature
Returns¶
- filenames : list
two lists of file paths, one per file in the input dataframe: the first list is the remote file paths and the second is the local file paths
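A sketch, assuming the two documented lists unpack as a pair:

```python
from ibc_api import utils

db = utils.get_info(data_type="volume_maps")
subset = utils.filter_data(db, subject_list=["01"])

# First list: remote file paths; second list: local file paths.
remote_paths, local_paths = utils.get_file_paths(subset)
print(remote_paths[0], local_paths[0])
```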
- ibc_api.utils.get_info(data_type='volume_maps', save_to=None, metadata={'preprocessed': [{'db_file': 'preprocessed_v1.csv', 'id': '3ca4f5a1-647b-4829-8107-588a699763c1', 'root': 'PreprocessedData_v1.0', 'version': 1}], 'raw': [{'db_file': '', 'id': '0e5b9c99-4cb9-4b93-960f-e2a7fe6a16dd', 'root': '', 'version': 1}, {'db_file': 'raw_v2.csv', 'id': 'a1c940cc-4777-417e-9326-dd6584d6c71f', 'root': 'v2.0', 'version': 2}, {'db_file': 'raw_v3.csv', 'id': '8ddf749f-fb1d-4d16-acc3-fbde91b90e24', 'root': 'v3.0', 'version': 3}], 'surface_maps': [{'db_file': 'surface_maps_v1.csv', 'id': 'ad04f919-7dcc-48d9-864a-d7b62af3d49d', 'root': 'resulting_smooth_maps_surface', 'version': 1}], 'volume_maps': [{'db_file': 'volume_maps_v1.csv', 'id': '07ab1665-73b0-40c5-800e-557bc319109d', 'root': 'resulting_smooth_maps', 'version': 1}, {'db_file': 'volume_maps_v2.csv', 'id': 'ad04f919-7dcc-48d9-864a-d7b62af3d49d', 'root': 'resulting_smooth_maps', 'version': 2}]})¶
Fetch a csv file describing each file in a given IBC dataset on EBRAINS (usage sketch below).
Parameters¶
- data_type : str, optional
dataset to fetch, by default "volume_maps"; one of ["volume_maps", "surface_maps", "raw", "preprocessed"]
- save_to : str or None, optional
filename to save this csv as, by default None; if None, saves as "ibc_data/available_{data_type}.csv"
- metadata : dict, optional
dictionary object containing version info, dataset ids etc., by default the packaged metadata shown in the signature
Returns¶
- pandas.DataFrame
dataframe with information about each file in the dataset
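A sketch of the default call; the csv is also written to disk as described above:

```python
from ibc_api import utils

# Index of the surface maps dataset as a dataframe; also saved as
# "ibc_data/available_surface_maps.csv" when save_to is None.
db = utils.get_info(data_type="surface_maps")
print(db.columns.tolist())
```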