Repository Information
- Repository Name: Dockerfile to extract Gravitational Wave data from the ESCAPE datalake
- Record ID: 7025564
- Repository URL: https://git.ligo.org/rhys.poulton/escape-datalake-shared-volume-writer
- Tool Type: SoftwareSourceCode
- Version: 1.0.0
Results
| Indicator | Generated by | Evidence | Value | Run status | Check passed |
|---|---|---|---|---|---|
| https://w3id.org/everse/i/indicators/codemeta_completeness | Codemeta Completeness Tool | Codemeta completeness = 36.5%, minimal threshold to consider this check to pass is set to 20.0%. Found ['codeRepository', 'programmingLanguage', 'applicationCategory', 'downloadUrl', 'memoryRequirements', 'operatingSystem', 'processorRequirements', 'releaseNotes', 'storageRequirements', 'author', 'copyrightHolder', 'dateCreated', 'dateModified', 'datePublished', 'funder', 'keywords', 'license', 'version', 'description', 'identifier', 'name', 'identifier', 'name', 'maintainer', 'funding', 'issueTracker', 'readme'] keys in codemeta file. | 0.36486486486486486 | ✅ | ✅ |
| https://w3id.org/everse/i/indicators/codemeta_completeness | Codemeta Completeness Tool | Codemeta completeness = 36.5%, minimal threshold to consider this check to pass is set to 50.0%. Found ['codeRepository', 'programmingLanguage', 'applicationCategory', 'downloadUrl', 'memoryRequirements', 'operatingSystem', 'processorRequirements', 'releaseNotes', 'storageRequirements', 'author', 'copyrightHolder', 'dateCreated', 'dateModified', 'datePublished', 'funder', 'keywords', 'license', 'version', 'description', 'identifier', 'name', 'identifier', 'name', 'maintainer', 'funding', 'issueTracker', 'readme'] keys in codemeta file. | 0.36486486486486486 | ✅ | ❌ |
| https://w3id.org/everse/i/indicators/doi_presence | SOMEF | Found DOI: 10.5281/zenodo.5742053 with confidence 100.0% from source https:///rhys.poulton/escape-datalake-shared-volume-writer/-/blob/main/codemeta.json | 10.5281/zenodo.5742053 | ⚠️ | ✅ |
| https://w3id.org/everse/i/indicators/codemeta_discrepancy | SOMEF, Codemeta Completeness Tool | Comparison value: 0.7205882352941176, Threshold: 0.5, Status: True | 0.7205882352941176 | ⚠️ | ✅ |
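The two codemeta_completeness rows report the same value against different thresholds (20.0% and 50.0%), which is why one passes and the other fails. A minimal sketch of that arithmetic follows. It assumes the value is the number of found keys divided by the total number of keys considered, and that a check passes when the value is at least the threshold. Both assumptions are inferred from the numbers in this report (27 found keys, 47 missing keys), not taken from the tool's source, and the function names are hypothetical.

```python
# Counts taken from this report; the formula itself is an assumption.
FOUND_KEYS = 27    # entries in the "Found [...]" list, duplicates included
MISSING_KEYS = 47  # entries in "missing_keys_1" of the discrepancy output


def completeness(found: int, missing: int) -> float:
    """Fraction of considered keys that are present (assumed definition)."""
    return found / (found + missing)


def passes(value: float, threshold: float) -> bool:
    """Assumed pass rule: value must reach the threshold."""
    return value >= threshold


value = completeness(FOUND_KEYS, MISSING_KEYS)
for threshold in (0.2, 0.5):
    verdict = "pass" if passes(value, threshold) else "fail"
    print(f"completeness {value:.1%} vs threshold {threshold:.1%}: {verdict}")
```

Under these assumptions the computed value agrees with the reported 0.36486486486486486, and the pass/fail outcome matches both table rows.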
Results Output
https://w3id.org/everse/i/indicators/codemeta_completeness
- Status: CompletedActionStatus
- Value: 0.36
- Evidence: Codemeta completeness = 36.5%, minimal threshold to consider this check to pass is set to 20.0%. Found ['codeRepository', 'programmingLanguage', 'applicationCategory', 'downloadUrl', 'memoryRequirements', 'operatingSystem', 'processorRequirements', 'releaseNotes', 'storageRequirements', 'author', 'copyrightHolder', 'dateCreated', 'dateModified', 'datePublished', 'funder', 'keywords', 'license', 'version', 'description', 'identifier', 'name', 'identifier', 'name', 'maintainer', 'funding', 'issueTracker', 'readme'] keys in codemeta file.
- Output:
{"pass": true, "value": 0.36486486486486486, "codemeta_dict": {"@context": "https://doi.org/10.5063/schema/codemeta-2.0", "@type": "SoftwareSourceCode", "license": "https://spdx.org/licenses/CC-BY-4.0", "maintainer": {"@type": "Person", "@id": "https://orcid.org/ 0000-0003-2049-520X", "givenName": "Rhys", "familyName": "Poulton", "email": "poulton@ego-gw.it", "affiliation": {"@type": "Organization", "name": "European Gravitational Observatory"}}, "copyrightHolder": {"@type": "Organization", "name": "European Gravitational Observatory"}, "codeRepository": "https://git.ligo.org/rhys.poulton/escape-datalake-shared-volume-writer", "readme": "https://git.ligo.org/rhys.poulton/escape-datalake-shared-volume-writer/-/blob/main/README.md", "dateCreated": "2021-11-30", "datePublished": "2021-11-30", "dateModified": "2021-11-30", "memoryRequirements": "500MB", "processorRequirements": "None", "storageRequirements": "5Gb", "downloadUrl": "https://git.ligo.org/rhys.poulton/escape-datalake-shared-volume-writer/-/archive/main/escape-datalake-shared-volume-writer-main.tar.gz", "issueTracker": "https://git.ligo.org/rhys.poulton/escape-datalake-shared-volume-writer/-/issues", "name": "Dockerfile to extract Gravitational Wave data from the ESCAPE datalake", "version": "1.0.0", "identifier": "10.5281/zenodo.5742053", "description": "This is a container to extract Gravitational Wave (GW) data from the datalake using Rucio and feed 1 second GW frames to the GW pipelines.", "applicationCategory": "Gravitational Waves", "funding": "ESCAPE 824064", "funder": {"@type": "Organization", "name": "European Union's Horizon 2020 research and innovation programme"}, "keywords": ["EGO-VIRGO"], "programmingLanguage": ["Python", "bash"], "operatingSystem": ["Centos 7"], "releaseNotes": "First release", "author": [{"@type": "Person", "@id": "https://orcid.org/ 0000-0003-2049-520X", "givenName": "Rhys", "familyName": "Poulton", "email": "poulton@ego-gw.it", "affiliation": {"@type": "Organization", 
"name": "European Gravitational Observatory"}}]}, "threshold": 0.2}
https://w3id.org/everse/i/indicators/codemeta_completeness
- Status: CompletedActionStatus
- Value: 0.36
- Evidence: Codemeta completeness = 36.5%, minimal threshold to consider this check to pass is set to 50.0%. Found ['codeRepository', 'programmingLanguage', 'applicationCategory', 'downloadUrl', 'memoryRequirements', 'operatingSystem', 'processorRequirements', 'releaseNotes', 'storageRequirements', 'author', 'copyrightHolder', 'dateCreated', 'dateModified', 'datePublished', 'funder', 'keywords', 'license', 'version', 'description', 'identifier', 'name', 'identifier', 'name', 'maintainer', 'funding', 'issueTracker', 'readme'] keys in codemeta file.
- Output:
{"pass": false, "value": 0.36486486486486486, "codemeta_dict": {"@context": "https://doi.org/10.5063/schema/codemeta-2.0", "@type": "SoftwareSourceCode", "license": "https://spdx.org/licenses/CC-BY-4.0", "maintainer": {"@type": "Person", "@id": "https://orcid.org/ 0000-0003-2049-520X", "givenName": "Rhys", "familyName": "Poulton", "email": "poulton@ego-gw.it", "affiliation": {"@type": "Organization", "name": "European Gravitational Observatory"}}, "copyrightHolder": {"@type": "Organization", "name": "European Gravitational Observatory"}, "codeRepository": "https://git.ligo.org/rhys.poulton/escape-datalake-shared-volume-writer", "readme": "https://git.ligo.org/rhys.poulton/escape-datalake-shared-volume-writer/-/blob/main/README.md", "dateCreated": "2021-11-30", "datePublished": "2021-11-30", "dateModified": "2021-11-30", "memoryRequirements": "500MB", "processorRequirements": "None", "storageRequirements": "5Gb", "downloadUrl": "https://git.ligo.org/rhys.poulton/escape-datalake-shared-volume-writer/-/archive/main/escape-datalake-shared-volume-writer-main.tar.gz", "issueTracker": "https://git.ligo.org/rhys.poulton/escape-datalake-shared-volume-writer/-/issues", "name": "Dockerfile to extract Gravitational Wave data from the ESCAPE datalake", "version": "1.0.0", "identifier": "10.5281/zenodo.5742053", "description": "This is a container to extract Gravitational Wave (GW) data from the datalake using Rucio and feed 1 second GW frames to the GW pipelines.", "applicationCategory": "Gravitational Waves", "funding": "ESCAPE 824064", "funder": {"@type": "Organization", "name": "European Union's Horizon 2020 research and innovation programme"}, "keywords": ["EGO-VIRGO"], "programmingLanguage": ["Python", "bash"], "operatingSystem": ["Centos 7"], "releaseNotes": "First release", "author": [{"@type": "Person", "@id": "https://orcid.org/ 0000-0003-2049-520X", "givenName": "Rhys", "familyName": "Poulton", "email": "poulton@ego-gw.it", "affiliation": {"@type": "Organization", 
"name": "European Gravitational Observatory"}}]}, "threshold": 0.5}
https://w3id.org/everse/i/indicators/codemeta_discrepancy
- Status: PotentialActionStatus
- Value: 0.72
- Evidence: Comparison value: 0.7205882352941176, Threshold: 0.5, Status: True
- Output:
{"pass": true, "value": 0.7205882352941176, "threshold": 0.5, "results": {"completeness_1": 0.36486486486486486, "codemeta_version_1": "codemeta-2.0", "codemeta_version_2": "codemeta-3.0", "completeness_2": 0.20270270270270271, "missing_keys_1": ["runtimePlatform", "targetProduct", "applicationSubCategory", "fileSize", "installUrl", "permissions", "softwareHelp", "softwareRequirements", "softwareVersion", "supportingData", "citation", "contributor", "copyrightYear", "editor", "encoding", "fileFormat", "producer", "provider", "publisher", "sponsor", "isAccessibleForFree", "isPartOf", "hasPart", "position", "sameAs", "url", "relatedLink", "givenName", "familyName", "email", "affiliation", "address", "", "", "softwareSuggestions", "contIntegration", "buildInstructions", "developmentStatus", "embargoDate", "referencePublication", "creator", "", "", "", "endDate", "roleName", "startDate"], "missing_keys_2": ["runtimePlatform", "targetProduct", "applicationCategory", "applicationSubCategory", "fileSize", "installUrl", "memoryRequirements", "operatingSystem", "permissions", "processorRequirements", "releaseNotes", "softwareHelp", "softwareRequirements", "softwareVersion", "storageRequirements", "supportingData", "citation", "contributor", "copyrightHolder", "copyrightYear", "datePublished", "editor", "encoding", "fileFormat", "funder", "producer", "provider", "publisher", "sponsor", "version", "isAccessibleForFree", "isPartOf", "hasPart", "position", "sameAs", "url", "relatedLink", "givenName", "familyName", "email", "affiliation", "address", "", "", "softwareSuggestions", "maintainer", "continuousIntegration", "buildInstructions", "developmentStatus", "embargoEndDate", "funding", "referencePublication", "creator", "review", "reviewAspect", "reviewBody", "endDate", "roleName", "startDate"], "existing_keys_1": ["codeRepository", "programmingLanguage", "applicationCategory", "downloadUrl", "memoryRequirements", "operatingSystem", "processorRequirements", "releaseNotes", 
"storageRequirements", "author", "copyrightHolder", "dateCreated", "dateModified", "datePublished", "funder", "keywords", "license", "version", "description", "identifier", "name", "identifier", "name", "maintainer", "funding", "issueTracker", "readme"], "existing_keys_2": ["codeRepository", "programmingLanguage", "downloadUrl", "author", "dateCreated", "dateModified", "keywords", "license", "description", "identifier", "name", "identifier", "name", "issueTracker", "readme"], "differences": {"codeRepository": {"value_in_1": "https://git.ligo.org/rhys.poulton/escape-datalake-shared-volume-writer", "value_in_2": "https://git.ligo.org/rhys.poulton/escape-datalake-shared-volume-writer/"}, "applicationCategory": {"value_in_1": "Gravitational Waves", "value_in_2": null}, "downloadUrl": {"value_in_1": "https://git.ligo.org/rhys.poulton/escape-datalake-shared-volume-writer/-/archive/main/escape-datalake-shared-volume-writer-main.tar.gz", "value_in_2": "https://git.ligo.org/rhys.poulton/escape-datalake-shared-volume-writer/-/branches"}, "memoryRequirements": {"value_in_1": "500MB", "value_in_2": null}, "operatingSystem": {"value_in_1": ["Centos 7"], "value_in_2": null}, "processorRequirements": {"value_in_1": "None", "value_in_2": null}, "releaseNotes": {"value_in_1": "First release", "value_in_2": null}, "storageRequirements": {"value_in_1": "5Gb", "value_in_2": null}, "author": {"value_in_1": [{"@type": "Person", "@id": "https://orcid.org/ 0000-0003-2049-520X", "givenName": "Rhys", "familyName": "Poulton", "email": "poulton@ego-gw.it", "affiliation": {"@type": "Organization", "name": "European Gravitational Observatory"}}], "value_in_2": [{"@type": "Person", "email": "poulton@ego-gw.it", "name": null}]}, "copyrightHolder": {"value_in_1": {"@type": "Organization", "name": "European Gravitational Observatory"}, "value_in_2": null}, "datePublished": {"value_in_1": "2021-11-30", "value_in_2": null}, "funder": {"value_in_1": {"@type": "Organization", "name": "European Union's 
Horizon 2020 research and innovation programme"}, "value_in_2": null}, "license": {"value_in_1": "https://spdx.org/licenses/CC-BY-4.0", "value_in_2": {"identifier": "https://spdx.org/licenses/https://spdx.org/licenses/CC-BY-4.0", "spdx_id": "https://spdx.org/licenses/CC-BY-4.0"}}, "version": {"value_in_1": "1.0.0", "value_in_2": null}, "description": {"value_in_1": "This is a container to extract Gravitational Wave (GW) data from the datalake using Rucio and feed 1 second GW frames to the GW pipelines.", "value_in_2": ["This is a container to extract Gravitational Wave (GW) data from the datalake using Rucio and feed 1 second GW frames to the GW pipelines."]}, "identifier": {"value_in_1": "10.5281/zenodo.5742053", "value_in_2": ["10.5281/zenodo.5742053"]}, "maintainer": {"value_in_1": {"@type": "Person", "@id": "https://orcid.org/ 0000-0003-2049-520X", "givenName": "Rhys", "familyName": "Poulton", "email": "poulton@ego-gw.it", "affiliation": {"@type": "Organization", "name": "European Gravitational Observatory"}}, "value_in_2": null}, "funding": {"value_in_1": "ESCAPE 824064", "value_in_2": null}, "issueTracker": {"value_in_1": "https://git.ligo.org/rhys.poulton/escape-datalake-shared-volume-writer/-/issues", "value_in_2": "https://git.ligo.org/rhys.poulton/escape-datalake-shared-volume-writer//issues"}}, "equivalences": {"programmingLanguage": ["Python", "bash"], "runtimePlatform": null, "targetProduct": null, "applicationSubCategory": null, "fileSize": null, "installUrl": null, "permissions": null, "softwareHelp": null, "softwareRequirements": null, "softwareVersion": null, "supportingData": null, "citation": null, "contributor": null, "copyrightYear": null, "dateCreated": "2021-11-30", "dateModified": "2021-11-30", "editor": null, "encoding": null, "fileFormat": null, "keywords": ["EGO-VIRGO"], "producer": null, "provider": null, "publisher": null, "sponsor": null, "isAccessibleForFree": null, "isPartOf": null, "hasPart": null, "position": null, "name": 
"Dockerfile to extract Gravitational Wave data from the ESCAPE datalake", "sameAs": null, "url": null, "relatedLink": null, "givenName": null, "familyName": null, "email": null, "affiliation": null, "address": null, "": null, "softwareSuggestions": null, "contIntegration": null, "buildInstructions": null, "developmentStatus": null, "embargoDate": null, "referencePublication": null, "readme": "https://git.ligo.org/rhys.poulton/escape-datalake-shared-volume-writer/-/blob/main/README.md", "creator": null, "endDate": null, "roleName": null, "startDate": null}}}
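The discrepancy output above lists the keys found in each of the two compared metadata sets (`existing_keys_1` and `existing_keys_2`). A rough way to see which keys appear in only one of them is a set difference, sketched below with the key lists copied from the output. Which source each list corresponds to is not stated explicitly in the output, so the comments only refer to them as set 1 and set 2.

```python
# Key lists copied from the codemeta_discrepancy output (duplicates kept).
existing_keys_1 = [
    "codeRepository", "programmingLanguage", "applicationCategory",
    "downloadUrl", "memoryRequirements", "operatingSystem",
    "processorRequirements", "releaseNotes", "storageRequirements",
    "author", "copyrightHolder", "dateCreated", "dateModified",
    "datePublished", "funder", "keywords", "license", "version",
    "description", "identifier", "name", "identifier", "name",
    "maintainer", "funding", "issueTracker", "readme",
]
existing_keys_2 = [
    "codeRepository", "programmingLanguage", "downloadUrl", "author",
    "dateCreated", "dateModified", "keywords", "license", "description",
    "identifier", "name", "identifier", "name", "issueTracker", "readme",
]

# Keys present in one set but absent from the other.
only_in_1 = sorted(set(existing_keys_1) - set(existing_keys_2))
only_in_2 = sorted(set(existing_keys_2) - set(existing_keys_1))

print("only in set 1:", only_in_1)
print("only in set 2:", only_in_2)
```

The keys that show up only in set 1 are the same ones whose `value_in_2` is `null` in the `differences` section of the output.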
Logs
Log File: docs/records/7025564/7025564_somef_log.txt
2026-03-05 10:10:32,916 somef_tool.py:140 INFO Running SOMEF on repository: https://git.ligo.org/rhys.poulton/escape-datalake-shared-volume-writer
2026-03-05 10:10:41,396 somef_utils.py:43 INFO SOftware Metadata Extraction Framework (SOMEF) Command Line Interface
CODEMETA PARSER - Processing file: /tmp/tmpgtyg_3d1/repo/escape-datalake-shared-volume-writer-main/codemeta.json
CODEMETA PARSER - Source: https:///rhys.poulton/escape-datalake-shared-volume-writer/-/blob/main/codemeta.json
Saving json data to docs/records/7025564/7025564_somef.json
Success
2026-03-05 10:10:41,396 somef_utils.py:45 ERROR 05-Mar-26 10:10:38-DEBUG-Starting new HTTPS connection (1): git.ligo.org:443
05-Mar-26 10:10:39-DEBUG-https://git.ligo.org:443 "GET /api/v4/projects HTTP/1.1" 200 2908
05-Mar-26 10:10:39-INFO-git.ligo.org is GitLab.
05-Mar-26 10:10:39-INFO-Loading Repository https://git.ligo.org/rhys.poulton/escape-datalake-shared-volume-writer Information....
05-Mar-26 10:10:39-INFO-Downloading https://git.ligo.org/api/v4/projects/rhys.poulton%2Fescape-datalake-shared-volume-writer
05-Mar-26 10:10:39-DEBUG-Starting new HTTPS connection (1): git.ligo.org:443
05-Mar-26 10:10:39-DEBUG-https://git.ligo.org:443 "GET /api/v4/projects/rhys.poulton%2Fescape-datalake-shared-volume-writer HTTP/1.1" 200 545
05-Mar-26 10:10:39-INFO-Project_id: 11614
05-Mar-26 10:10:39-INFO-Downloading https://git.ligo.org/api/v4/projects/11614
05-Mar-26 10:10:39-DEBUG-Starting new HTTPS connection (1): git.ligo.org:443
05-Mar-26 10:10:39-DEBUG-https://git.ligo.org:443 "GET /api/v4/projects/11614 HTTP/1.1" 200 545
05-Mar-26 10:10:39-INFO-Getting releases from: https://git.ligo.org/api/v4/projects/11614/releases?page=1&per_page=100
05-Mar-26 10:10:39-DEBUG-Starting new HTTPS connection (1): git.ligo.org:443
05-Mar-26 10:10:39-DEBUG-https://git.ligo.org:443 "GET /api/v4/projects/11614/releases?page=1&per_page=100 HTTP/1.1" 200 2
05-Mar-26 10:10:39-INFO-Response: 200
05-Mar-26 10:10:39-WARNING-Not releseases found.
05-Mar-26 10:10:39-DEBUG-Starting new HTTPS connection (1): git.ligo.org:443
05-Mar-26 10:10:39-DEBUG-https://git.ligo.org:443 "GET /rhys.poulton/escape-datalake-shared-volume-writer/-/raw/master/LICENSE HTTP/1.1" 404 2613
05-Mar-26 10:10:39-INFO-Repository information successfully loaded.
05-Mar-26 10:10:39-INFO-Downloading https://git.ligo.org/rhys.poulton/escape-datalake-shared-volume-writer/-/archive/main/escape-datalake-shared-volume-writer-main.zip
05-Mar-26 10:10:39-DEBUG-Starting new HTTPS connection (1): git.ligo.org:443
05-Mar-26 10:10:39-DEBUG-https://git.ligo.org:443 "GET /rhys.poulton/escape-datalake-shared-volume-writer/-/archive/main/escape-datalake-shared-volume-writer-main.zip HTTP/1.1" 200 17747
05-Mar-26 10:10:40-INFO-Extracting information using headers
/builds/escape-ossr/rs_quality_checks/.venv/lib/python3.11/site-packages/somef/header_analysis.py:112: FutureWarning: A value is trying to be set on a copy of a DataFrame or Series through chained assignment using an inplace method.
The behavior will change in pandas 3.0. This inplace method will never work because the intermediate object on which we are setting values always behaves as a copy.
For example, when doing 'df[col].method(value, inplace=True)', try using 'df.method({col: value}, inplace=True)' or df[col] = df[col].method(value) instead, to perform the operation inplace on the original object.
df['Content'].replace('', np.nan, inplace=True)
05-Mar-26 10:10:40-INFO-Labeling headers.
/builds/escape-ossr/rs_quality_checks/.venv/lib/python3.11/site-packages/somef/header_analysis.py:224: FutureWarning: ChainedAssignmentError: behaviour will change in pandas 3.0!
You are setting values through chained assignment. Currently this works in certain cases, but when using Copy-on-Write (which will become the default behaviour in pandas 3.0) this will never work to update the original DataFrame or Series, because the intermediate object on which we are setting values will behave as a copy.
A typical example is when you are setting values in a column of a DataFrame, like:
df["col"][row_indexer] = value
Use `df.loc[row_indexer, "col"] = values` instead, to perform the assignment in a single step and ensure this keeps updating the original `df`.
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
data['Group'].iloc[0] = ['unknown']
/builds/escape-ossr/rs_quality_checks/.venv/lib/python3.11/site-packages/somef/header_analysis.py:230: FutureWarning: ChainedAssignmentError: behaviour will change in pandas 3.0!
You are setting values through chained assignment. Currently this works in certain cases, but when using Copy-on-Write (which will become the default behaviour in pandas 3.0) this will never work to update the original DataFrame or Series, because the intermediate object on which we are setting values will behave as a copy.
A typical example is when you are setting values in a column of a DataFrame, like:
df["col"][row_indexer] = value
Use `df.loc[row_indexer, "col"] = values` instead, to perform the assignment in a single step and ensure this keeps updating the original `df`.
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
data['Group'].iloc[0] = np.NaN
05-Mar-26 10:10:40-INFO-Header information extracted.
05-Mar-26 10:10:40-INFO-Splitting text into valid excerpts for classification
05-Mar-26 10:10:40-INFO-Extraction of bibtex citation from readme completed.
05-Mar-26 10:10:40-INFO-Text Successfully split.
05-Mar-26 10:10:40-INFO-Classifying excerpts for the category description
05-Mar-26 10:10:40-INFO-Checking thresholds for classified excerpts.
05-Mar-26 10:10:40-INFO-All excerpts below the threshold have been removed.
05-Mar-26 10:10:40-DEBUG-Starting new HTTPS connection (1): git.ligo.org:443
05-Mar-26 10:10:40-DEBUG-https://git.ligo.org:443 "GET /rhys.poulton/escape-datalake-shared-volume-writer/wiki HTTP/1.1" 302 100
05-Mar-26 10:10:40-INFO-Completed extracting regular expressions
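The FutureWarning messages in the log above come from chained assignment in `somef/header_analysis.py` and do not affect the results of this run. The sketch below illustrates the replacements that the pandas warnings themselves suggest, applied to a small hypothetical DataFrame. The column names mirror those in the warnings, but the data is invented for illustration and this is not the SOMEF code.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the header table; not SOMEF's actual data.
df = pd.DataFrame(
    {"Content": ["intro", "", "usage"], "Group": ["a", "b", "c"]}
)

# Instead of: df['Content'].replace('', np.nan, inplace=True)
# assign the result back to the column, as the first warning suggests.
df["Content"] = df["Content"].replace("", np.nan)

# Instead of: data['Group'].iloc[0] = np.NaN
# set the value in a single .loc step, as the second warning suggests.
# np.nan is used here because the np.NaN alias was removed in NumPy 2.0.
df.loc[df.index[0], "Group"] = np.nan

print(df)
```

Both forms update the original DataFrame directly, so they should keep working once Copy-on-Write becomes the default in pandas 3.0.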
Log File: docs/records/7025564/7025564_codemeta_completeness_tool_log.txt
2026-03-05 10:10:41,398 codemeta_completeness_tool.py:72 INFO [codemeta completeness tool] Running Codemeta Completeness Tool on record ID: 7025564