Accessing Controlled Data
Data submitted to NeMO falls into three categories:
- Public - data to be immediately distributed openly and freely to the wider research community,
- Embargo - data to be held back, or embargoed, until a specific date, at which point it will be released openly and freely to the wider research community,
- Restricted - controlled-access data to be distributed only to an approved group of users due to consent restrictions (e.g. human data), through the processes described here. Restricted datasets often contain a combination of private (raw reads, alignments) and public (counts, peaks) datatypes. In such instances, the dataset landing page or BDBag will provide direct access to the public data, along with a link to this page for restricted data access instructions.
You can see the current list of projects available for access request by logging into the NIMH Data Archive and selecting Get Data > Request Access. Under NDA Controlled Access Permission Groups, you will find all BRAIN/NeMO restricted datasets currently available for access request.
In this document:
- Restricted access datasets currently available for request at NDA
- Requesting access through the NDA Approval Process
- Downloading Data
Restricted access datasets currently available for request at NDA
The NIMH Data Archive will review data access requests in consideration of the data use limitations on these datasets before approval. For further instructions on how to request and get access to these datasets, go to Requesting access through the NDA Approval Process.
|Dataset||Description||Data Use Limitations||BICCN Webpage||Dataset Page|
|BRAIN/NeMO: A multimodal atlas of human brain cell types: Patch-Seq||This study (NIH Grant 1U01MH114812-01; PI: Lein, Ed) is part of an international consortium approach using single cell transcriptomics, human cellular physiology and anatomy and neuronal modeling to begin to create an atlas of human brain cell types. In this dataset, transcriptomic profiles of individual human cortical neurons were assayed by the SMART-seq v4 single nucleus RNA-seq method following acute brain slice patch-clamp electrophysiology recordings and nucleus extraction, i.e. the Patch-seq method.||Use of data is limited to brain in health and disease||Glutamatergic Neuron Types in Layer 2 and Layer 3 of the Human Cortex||dat-18e8800|
|BRAIN/NeMO: Cell type-specific 3D epigenomes in the developing human cortex||This study (NIH grant U01DA052713; PI: Shen, Yin) examines cis-regulatory chromatin interactions, open chromatin peaks, and transcriptomes for radial glia, intermediate progenitor cells, excitatory neurons, and interneurons isolated from mid-gestational human cortex samples.||These data have a data use limitation of General Research Use. Use of the data is limited only by the terms of the Data Use Certification and the requestor must provide documentation of local IRB approval.|| ||dat-uioqy8b|
|BRAIN/NeMO: A Cellular Resolution Census of the Developing Human Brain||This study (NIH grant U01MH114825; PI: Kriegstein, Arnold) aims to create a spatiotemporal single cell resolution map of the developing human neocortex to establish how many distinct cell types are present and to unravel their complex developmental history.||Use of the data is limited to health, medical, and biomedical purposes and to not-for-profit organizations.||Single-Cell Sequencing of the Developing Human Brain||dat-t4xnmho (10x v2); dat-qf43d3z (10x v3)|
|BRAIN/NeMO: A Multimodal atlas of human brain cell types: Human variation RNAseq & WGS (Lein)||This dataset funded by Allen Institute for Brain Science (PI: Lein, Ed) is part of a multimodal investigation into human variation of brain cell types using RNA sequencing and whole genome shotgun sequencing.||Use of the data is limited to General Research Use (GRU).||Characterization of transcriptomic and genomic variation in adult human cortex||dat-3seoxdg (10x); dat-ofcr6j0 (wgs)|
|BRAIN/NeMO: A Multimodal atlas of human brain cell types: Human variation RNAseq & WGS, Limited to Brain Health & Disease (Lein)||This dataset is associated with permission group ‘BRAIN/NeMO: A Multimodal atlas of human brain cell types: Human variation RNAseq & WGS’. Access approval for this permission group WILL include access to the associated GRU permission group, so please do not apply for access to both groups.||Use of this dataset is limited to Brain Health & Disease.||Characterization of transcriptomic and genomic variation in adult human cortex||dat-3seoxdg (10x); dat-ofcr6j0 (wgs)|
|BRAIN/NeMO: A multimodal atlas of human brain cell types: Human Cortex RNASeq||This study is part of an international consortium approach using single cell transcriptomics, human cellular physiology and anatomy and neuronal modeling to begin to create an atlas of human brain cell types. In this dataset, single nucleus RNA-sequencing using SMART-Seq v4 methodology was applied to perform a comprehensive analysis of cell types in a variety of human cortical areas (MTG, M1C, CgGr, V1C, S1C and A1C), largely from postmortem brain.||Limited to General Research Use (GRU)||Human Multiple Cortical Areas-SMART-seq; Cellular diversity in human, non-human primate, and mouse LGN; An atlas of human-specialized cortical cell types||dat-2c8otht; dat-9fbmh99|
|BRAIN/NeMO: A multimodal atlas of human brain cell types: Middle Temporal Gyrus & Human Cortex RNASeq||This study is part of an international consortium approach using single cell transcriptomics, human cellular physiology and anatomy and neuronal modeling to begin to create an atlas of human brain cell types. In this dataset, single nucleus RNA-sequencing using SMART-Seq v4 methodology was applied to perform a comprehensive analysis of cell types from middle temporal gyrus (MTG) from neurosurgical tissue and in a variety of human cortical areas (MTG, M1C, CgGr, V1C, S1C and A1C), largely from postmortem brain.||Human cortical RNAseq data is limited to General Research Use (GRU) and the Middle Temporal Gyrus data is limited to Health/Medical/Biomedical (HMB).||Human Multiple Cortical Areas-SMART-seq||dat-swzf4kc|
|BRAIN/NeMO: Single Cell ATACseq of the Developing Human Brain (Nowakowski)||This dataset, funded under NIH grant 1U01MH114825-01 (PI: Kriegstein, Arnold), is part of an investigation into chromatin accessibility of cells from eight distinct areas of the developing human forebrain using single cell ATAC-seq (scATACseq).||Use of the data is limited to General Research Use (GRU).||Single-Cell ATAC-seq of the Developing Human Brain||dat-dgarbc3|
|BRAIN/NeMO: Single-cell analysis of prenatal and postnatal human cortical development||Controlled access to genomics data housed at the BRAIN/NeMO data archive. This study (NIH grant U01MH114825, PI: Kriegstein, Arnold) aims to create a spatiotemporal single cell resolution map of the developing human neocortex to establish how many distinct cell types are present and to unravel their complex developmental history. These data have a consent-based data use limitation of General Research Use. The BRAIN/NeMO DAC will review data access requests in consideration of this data use limitation. For further instructions on how to access this dataset, go to https://nemoarchive.org/resources/accessing-controlled-access-data.php.||Use of the data is limited to General Research Use (GRU).||Single-Nuclei Sequencing of Late Stages of Human Brain Development and Early Postnatal Life||dat-b3brzfa|
|BRAIN/NeMO: A single-cell genomic atlas for maturation of the human cerebellum during early childhood||This dataset, funded under NIH grant R21MH128462 (PI: Seth Ament), is part of an investigation in which single-cell RNA sequencing of the post-mortem human cerebellum characterized transcriptional changes in Purkinje and Golgi neurons during early childhood. The study revealed that these developmental gene regulatory programs are prematurely down-regulated in the brains of children who perished under conditions that included inflammation compared to those who succumbed to sudden accidental death. These data have a consent-based data use limitation of General Research Use. The BRAIN/NeMO DAC will review data access requests in consideration of this data use limitation. For further instructions on how to access this dataset, go to https://nemoarchive.org/resources/accessing-controlled-access-data.php.||Use of the data is limited to General Research Use (GRU).|| ||dat-sxfwwo8|
Requesting access through the NDA Approval Process
Permissions for restricted data access at NeMO are facilitated by the NIMH Data Archive (NDA). NDA and NeMO are working together to ensure a smooth process. NDA provides an SOP for institutionally sponsored data access requests; this page, however, outlines the steps required for NeMO-specific restricted data access.
Step 1. Log in to NDA
NDA now uses the NIH Researcher Auth Service (RAS) for login. If you are unable to log in with RAS, please create a Login.gov account associated with your institutional email address. NDA account requests MUST be made using an institutional email address. Account requests made from a personal email address will not be honored by NeMO or NDA and will therefore slow down the process of accessing data.
Step 2. Identify Datasets available through the NDA Dashboard
Click on the Data Permissions tab and scroll down to the NDA Controlled Access Permission Groups. Here you will find all BRAIN/NeMO Data Archive datasets currently available for access request. To the right of your dataset(s) of interest is an "Actions" dropdown. Select "Request Access". You must work at a research institution that has an active Federal-Wide Assurance in order to initiate a data access request.
Step 3. Data Access Request Tool
This will open the Data Access Request Tool where you will provide information pertaining to your research, institution, and collaborators. Please carefully review the instructions for properly filling out all tabs of the request tool:
A) Request Access Instructions
B) Research Data Use Statement
Access requests for controlled access permission groups should include a Research Data Use Statement that appropriately addresses consent-based data use limitations for that permission group. To determine if there are consent-based data use limitations to which authorized researchers must adhere, refer to the “Data Use Limitations” field next to the BRAIN/NeMO dataset of interest in the NDA Controlled Access Permission Group table.
C) Authorized Research Institute
You must work at a research institution that has an active Federal-Wide Assurance in order to initiate a data access request. The signing official(s) associated with your institution will automatically appear as a selectable option.
D) Other Access Recipients
Each data access application is restricted to users from a single institution. If you have collaborators at other organizations, they must submit a separate data access application.
Step 4. Download Data Use Certification (DUC)
Download the Data Use Certification Agreement PDF from the Agreement tab and complete it with the signatures of both the investigator and the institutional Signing Official. Contact the NDA Help Desk if you need assistance identifying Signing Officials at your research institution.
Step 5. Upload signed DUC
Log in to the NDA Permissions Dashboard and upload the signed DUC to the “Active Requests” section at the top of the page.
Step 6. Review
Your data access request will be reviewed by an NIH Data Access Committee (DAC). This process typically takes about one week.
Step 7. Access request decision
NDA will inform investigators of the final access decision. Once you are approved, forward your Access Approval email to the NeMO team at firstname.lastname@example.org. Failure to do so will delay delivery of your access credentials. NeMO grants data access to investigators for one year, after which investigators will need to reapply for access using the process described above.
Downloading Data using Google Cloud Platform
If you are a legacy data user and still have access via Aspera, those download instructions can be found here.
Restricted NeMO data is now available through Google Cloud Platform (GCP). There are two mechanisms for access: the web interface and the command line.
NeMO data is requester-pays, so downloading by either mechanism requires a Google billing account. To avoid incurring large charges for data download, we strongly recommend that you run data analyses on GCP if possible.
Step 1. Creating an institutional Google Cloud account
Once you have received your NDA approval, the next step is to set up an institutional Google account. This can be done by going to https://myaccount.google.com/ and selecting Create Account, using the SAME institutional email address that you used for NDA account creation. Not sure what email address is associated with your NDA account? Log in to NDA to access your user dashboard. You will find your email address under your Profile.
For example, if you used the email address email@example.com when you set up your NDA account, then you would now create a new Google account using that same address.
Once the account is set up, notify email@example.com that you have created your institutional Google account. We will then configure permissions and provide you with the bucket name(s).
Step 2. Setting up GCP Billing
To set up a new billing account go to https://console.cloud.google.com/billing, click CREATE ACCOUNT and follow the instructions.
More information on billing accounts is available here.
Step 3a. Access via the GCP Browser Web Interface
Go to https://console.cloud.google.com/storage/browser/<bucket name without leading gs:// >
For example, https://console.cloud.google.com/storage/browser/human-cortex
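The URL pattern above can also be built in a script. The following is a minimal sketch, assuming a bash-compatible shell; `bucket_console_url` is a hypothetical helper name introduced here for illustration, not part of any Google tool, and `human-cortex` is the example bucket used above.

```shell
# Hypothetical helper: build the Cloud Console browser URL for a bucket.
# Accepts the bucket name with or without the leading gs:// prefix.
bucket_console_url() {
  local bucket="${1#gs://}"   # strip a leading gs:// if present
  bucket="${bucket%/}"        # strip a trailing slash if present
  printf 'https://console.cloud.google.com/storage/browser/%s\n' "$bucket"
}

bucket_console_url gs://human-cortex
```

The bucket name you receive from NeMO can be passed either as `gs://name` or as `name`.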
In the upper right corner, ensure that you are logged in with your institutional account, not a personal account, or you will not see any data listed.
If it is not already populated, click on the button to select the billing account that you previously created.
Navigate by clicking on the directories listed in the table. Individual files can be downloaded using the GCP Browser. Batch downloads require running the gsutil command line tool: click on the directory you want to download, then click on DOWNLOAD in the menu directly above the data table. A popup will appear providing the gsutil command to run on your command line. For more on gsutil, read on.
Step 3b. Access via gsutil on the command line
Instructions for installing gsutil as part of the Google Cloud SDK are available here.
To access restricted data, you must authenticate your account. At the command line prompt, type
gcloud auth login
Follow the directions in the terminal, which will point you to a URL that you must open in your browser. There you will log in to your institutional Google account. Once logged in, you will be provided with a verification code on your browser screen. Copy this code and paste it at the prompt on the command line. You should then see a message verifying your account and, if available, your billing project.
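Because restricted data is only visible to the institutional account, it can help to confirm which account is active after logging in. The following is a minimal sketch, assuming bash and an installed Google Cloud SDK. `check_account` is a hypothetical helper introduced for illustration, and `jdoe@example.edu` is a placeholder for the address registered with NDA. The `gcloud auth list` call with `--filter` and `--format` prints the currently active account.

```shell
# Hypothetical helper: compare the active gcloud account with the
# institutional address registered at NDA (placeholder values below).
check_account() {
  local expected="$1" active="$2"
  if [ "$active" = "$expected" ]; then
    echo "OK: active account is $active"
  else
    echo "WARNING: active account is '${active:-none}', expected '$expected'" >&2
    return 1
  fi
}

# Only query gcloud when it is installed on this machine.
if command -v gcloud >/dev/null 2>&1; then
  active="$(gcloud auth list --filter=status:ACTIVE --format='value(account)')"
  check_account "jdoe@example.edu" "$active" || echo "Run: gcloud auth login"
fi
```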
To list bucket contents,
gsutil -u [billing-project] ls -l gs://bucket
gsutil -u my-billing-project ls -l gs://human-cortex
To download contents,
gsutil -u [billing-project] cp gs://bucket/file.txt /path/to/local/machine/file.txt
gsutil -u my-billing-project cp gs://human-cortex/transcriptome/scell/SSv4/human/raw/Ex_sample_01.fastq.tar /Users/jdoe/Desktop/Ex_sample_01.fastq.tar
Batch downloading a directory works the same way; add the recursive option (-r) to the copy command,
gsutil -u [billing-project] cp -r "gs://bucket/*" /path/to/local/machine/
gsutil -u my-billing-project cp -r "gs://human-cortex/transcriptome/*" /Users/jdoe/Desktop
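Since downloads are billed to your project, it may be worth checking how much data sits under a path before copying it. The sketch below, assuming bash, uses `gsutil du -s -h` to report the total size and then `gsutil -m cp -r` to copy recursively with parallel transfers. The `run` and `estimate_then_download` helpers and the `DRY_RUN` variable are hypothetical names introduced for illustration. With `DRY_RUN=1` the commands are only printed, not executed.

```shell
# Sketch only: the billing project, bucket path, and destination are placeholders.
# With DRY_RUN=1 each command is printed instead of executed.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

estimate_then_download() {
  local project="$1" src="$2" dest="$3"
  # Report the total size under the source path before copying anything.
  run gsutil -u "$project" du -s -h "$src"
  # Copy recursively; -m runs the transfers in parallel.
  run gsutil -m -u "$project" cp -r "$src" "$dest"
}

DRY_RUN=1 estimate_then_download my-billing-project gs://human-cortex/transcriptome /Users/jdoe/Desktop
```

Drop the `DRY_RUN=1` prefix to run the commands for real once the reported size looks acceptable.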