data.webtools.download_tarball
Function · Source
mdnc.data.webtools.download_tarball(
user, repo, tag, asset,
path='.', mode='auto', token=None, verbose=False
)
Download an online tarball from a GitHub release asset and extract it automatically.
This tool downloads assets from GitHub repositories. It will:
- Try to detect the data info in public mode;
- If that fails (the GitHub repository could not be accessed), switch to private downloading mode. The private mode requires a GitHub OAuth token to access the file.
- Send the tarball through a pipeline for extraction, so the archive itself is not stored.
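The public-then-private fallback can be sketched as follows. This is a minimal illustration rather than the library's actual implementation: the helper names `public_asset_url`, `private_request`, and `fetch_with_fallback` are hypothetical, and the sketch assumes that an inaccessible repository shows up as an HTTP error on the public route.

```python
import urllib.error
import urllib.request


def public_asset_url(user, repo, tag, asset):
    """Download URL of a release asset in a publicly accessible repository."""
    return f"https://github.com/{user}/{repo}/releases/download/{tag}/{asset}"


def private_request(asset_api_url, token):
    """Build (but do not send) an authenticated request for a release asset.

    `asset_api_url` is assumed to be the asset URL reported by the GitHub REST API.
    """
    return urllib.request.Request(
        asset_api_url,
        headers={
            "Authorization": f"token {token}",     # OAuth token for private access
            "Accept": "application/octet-stream",  # ask for the raw file content
        },
    )


def fetch_with_fallback(fetch_public, fetch_private):
    """Try the public route first; use the token-based route on an HTTP error."""
    try:
        return fetch_public()
    except urllib.error.HTTPError:
        # Assumption: a private repository answers anonymous requests with an error.
        return fetch_private()
```

Passing the two routes as callables keeps the fallback logic independent of how each request is actually sent.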
Currently supports the gz, bz2, and xz formats; see tarfile for details.
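Extracting a tarball without storing the archive can be done with the stream modes of the standard `tarfile` module. The sketch below is an illustration under assumptions, not the function's real code: `make_tarball` and `extract_stream` are hypothetical helpers, and an in-memory buffer stands in for the downloaded response.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def make_tarball(files, compression="gz"):
    """Build an in-memory tarball (a stand-in for a downloaded release asset)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode=f"w:{compression}") as tar:
        for name, payload in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    buffer.seek(0)
    return buffer


def extract_stream(stream, path, compression="gz"):
    """Extract a tarball from a file-like stream without saving the archive."""
    # "r|gz", "r|bz2" and "r|xz" are tarfile's stream modes: the data is read
    # sequentially, so the source does not need to support seeking.
    with tarfile.open(fileobj=stream, mode=f"r|{compression}") as tar:
        tar.extractall(path)


with tempfile.TemporaryDirectory() as root:
    archive = make_tarball({"data/hello.txt": b"hello"})
    extract_stream(archive, root)
    extracted = (Path(root) / "data" / "hello.txt").read_bytes()
```

Only the extracted files reach the target folder; the compressed archive exists only as the stream being read.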
Tip
The mechanics of this function are somewhat complicated. It is mainly inspired by the following code:
Arguments
Requires
Argument | Type | Description |
---|---|---|
user | str | The GitHub owner name of the repository; may be an organization. |
repo | str | The GitHub repository name. |
tag | str | The GitHub release tag where the data is uploaded. |
asset | str | The GitHub asset (tarball) name to be downloaded, including the file name suffix. |
path | str | The extracted data root path. Should be a folder path. |
mode | str | The mode of extraction. Could be 'gz', 'bz2', 'xz' or 'auto'. When using 'auto', the format is guessed from the suffix of the file name in the link. |
token | str | A given OAuth token. If this argument is unset, the program tries to find a token from environment variables. To learn how to set the token, please refer to mdnc.data.webtools.get_token. |
verbose | bool | A flag, whether to show the downloaded size during the web request. |
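The 'auto' setting of `mode` can be pictured with a short sketch. The helper `guess_mode` is hypothetical, and the exact rule used by the library may differ; here the format is assumed to be the last suffix of the asset name.

```python
import os

SUPPORTED_FORMATS = ("gz", "bz2", "xz")


def guess_mode(asset, mode="auto"):
    """Return the compression format, guessing from the file suffix for 'auto'."""
    if mode != "auto":
        if mode not in SUPPORTED_FORMATS:
            raise ValueError(f"Unsupported mode: {mode!r}")
        return mode
    # "xconfigs-u20-04.tar.xz" -> ".xz" -> "xz"
    suffix = os.path.splitext(asset)[1].lstrip(".").lower()
    if suffix not in SUPPORTED_FORMATS:
        raise ValueError(f"Cannot guess the format of {asset!r}")
    return suffix
```

With this rule, an asset named `xconfigs-u20-04.tar.xz` is treated as xz, while an explicit `mode` always takes precedence over the file name.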
Examples
Example 1
Get xconfigs-u20-04.tar.xz: 3.06kB [00:00, 263kB/s]
Example 2
data.webtools: A Github OAuth token is required for downloading the data in private repository. Please provide your OAuth token:
Token:****************************************
data.webtools: Tips: specify the environment variable $GITTOKEN or $GITHUB_API_TOKEN could help you skip this step.
Get test-datasets-1.tar.xz: 216B [00:00, 217kB/s]
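The token lookup hinted at in the output above can be sketched as follows. The helper `resolve_token` is hypothetical, and the order in which the two environment variables are checked is an assumption; only the variable names come from the message shown.

```python
import os


def resolve_token(token=None, environ=None):
    """Return an explicit token if given, else look it up in the environment."""
    environ = os.environ if environ is None else environ
    if token:
        return token
    # Variable names are taken from the tip printed by data.webtools.
    for name in ("GITTOKEN", "GITHUB_API_TOKEN"):
        value = environ.get(name)
        if value:
            return value
    # Nothing found: the caller would need to ask the user for a token.
    return None
```

Setting one of these variables before running the script should skip the interactive prompt.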
Last update: March 14, 2021