User Guide
Thank you for using aind-data-access-api! This guide is
intended for scientists and engineers at AIND who wish to interface
with AIND databases.
We have two primary databases:
A document database (DocDB) to store unstructured JSON documents. The DocDB contains AIND metadata.
A relational database to store structured tables.
Document Database (DocDB)
AIND metadata records stored in the DocDB describe the metadata.nd.json
for a data asset:
_id: the unique ID of the data asset.
name: the name of the data asset.
location: the S3 location of the metadata, in the format s3://{bucket_name}/{name}. This is unique across records and can be used to query or identify specific records.
Please see the readthedocs for aind-data-schema for more details.
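Because location follows the s3://{bucket_name}/{name} format, it can be split back into its two parts. The following is a minimal sketch using only the Python standard library; the helper name split_s3_location and the example record are hypothetical and are not part of aind-data-access-api.

```python
from urllib.parse import urlparse


def split_s3_location(location: str) -> tuple[str, str]:
    """Split an s3://{bucket_name}/{name} location into (bucket_name, name)."""
    parsed = urlparse(location)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 location: {location!r}")
    # The parsed path keeps its leading slash, so strip it to recover the name.
    return parsed.netloc, parsed.path.lstrip("/")


# Hypothetical record: the ID, bucket, and asset name are made up for illustration.
record = {
    "_id": "example-id",
    "name": "example_asset_name",
    "location": "s3://example-bucket/example_asset_name",
}
bucket_name, name = split_s3_location(record["location"])
```

For a record that follows the documented format, the second element should equal the record's name field.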
The DocDB can be accessed through a public read-only REST API or through a direct connection using SSH. For a direct connection, it is assumed you have the appropriate credentials.
REST API
A GET request to
https://api.allenneuraldynamics.org/v1/metadata_index/data_assets with appropriate query parameters will return a list of records found.
import json
import requests
URL = "https://api.allenneuraldynamics.org/v1/metadata_index/data_assets"
filter = {"subject.subject_id": "731015"}
limit = 500
response = requests.get(URL, params={"filter": json.dumps(filter), "limit": limit})
print(response.json())
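The request above can be wrapped in small helpers so that HTTP errors surface early instead of being printed as a JSON body. This is a sketch, not part of the library: build_params and fetch_records are hypothetical names, and the 30 second timeout is an arbitrary choice. Only the filter and limit query parameters shown above are used.

```python
import json

URL = "https://api.allenneuraldynamics.org/v1/metadata_index/data_assets"


def build_params(filter_query: dict, limit: int = 500) -> dict:
    """Encode the filter as a JSON string, as in the request above."""
    return {"filter": json.dumps(filter_query), "limit": limit}


def fetch_records(filter_query: dict, limit: int = 500, timeout: float = 30.0):
    """Run the GET request and raise on an HTTP error status."""
    import requests  # imported lazily so build_params works without requests

    response = requests.get(
        URL, params=build_params(filter_query, limit), timeout=timeout
    )
    response.raise_for_status()  # raises requests.HTTPError on 4xx/5xx
    return response.json()


params = build_params({"subject.subject_id": "731015"}, limit=10)
# To actually query the API (requires network access):
# records = fetch_records({"subject.subject_id": "731015"}, limit=10)
```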
We provide a Python client (recommended):
from aind_data_access_api.document_db import MetadataDbClient
API_GATEWAY_HOST = "api.allenneuraldynamics.org"
# Default database and collection names are set in the client
# To override the defaults, provide the database and collection
# parameters in the constructor
docdb_api_client = MetadataDbClient(
    host=API_GATEWAY_HOST,
)
filter = {"subject.subject_id": "731015"}
response = docdb_api_client.retrieve_docdb_records(
    filter_query=filter,
)
print(response)
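The filter key subject.subject_id uses dot notation to address a field nested inside the subject object. The sketch below emulates that lookup on plain dictionaries to show which records such a filter would select. It is a simplification under stated assumptions: it handles only exact equality on nested dictionaries (no query operators, no arrays), and get_nested, matches, and the sample records are hypothetical.

```python
from typing import Any

_MISSING = object()  # sentinel so that a missing field never equals a real value


def get_nested(record: dict, dotted_key: str, default: Any = None) -> Any:
    """Follow a dotted key such as "subject.subject_id" through nested dicts."""
    current: Any = record
    for part in dotted_key.split("."):
        if not isinstance(current, dict) or part not in current:
            return default
        current = current[part]
    return current


def matches(record: dict, filter_query: dict) -> bool:
    """Return True if every dotted key in filter_query equals the given value."""
    return all(
        get_nested(record, key, _MISSING) == value
        for key, value in filter_query.items()
    )


# Made-up records for illustration only.
records = [
    {"name": "asset_a", "subject": {"subject_id": "731015"}},
    {"name": "asset_b", "subject": {"subject_id": "000000"}},
    {"name": "asset_c"},
]
selected = [r["name"] for r in records if matches(r, {"subject.subject_id": "731015"})]
```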
Direct Connection (SSH) - Database UI (MongoDB Compass)
MongoDB Compass is a database GUI that can be used to query and interact with our document database.
To connect:
If provided a temporary SSH password, please first run ssh {ssh_username}@{ssh_host} and set a new password.
Download the full version of MongoDB Compass.
When connecting, click “Advanced Connection Options” and use the configurations below. Leave any unspecified fields on their default setting.
| Tab | Config | Value |
|---|---|---|
| General | Host | |
| Authentication | Username | |
| Authentication | Password | |
| Authentication | Authentication Mechanism | SCRAM-SHA-1 |
| TLS/SSL | SSL/TLS Connection | OFF |
| Proxy/SSH | SSH Tunnel/ Proxy Method | SSH with Password |
| Proxy/SSH | SSH Hostname | |
| Proxy/SSH | SSH Port | 22 |
| Proxy/SSH | SSH Username | |
| Proxy/SSH | SSH Password | |
| (Optional) Advanced | Read Preference | Secondary Preferred |
| (Optional) Advanced | Replica Set Name | rs0 |
You should be able to see the home page with the metadata_index database. It should have a single collection called data_assets.
If provided with a temporary DocDB password, please change it using the embedded mongo shell in Compass, and then reconnect.
db.updateUser(
    "doc_db_username",
    {
        pwd: passwordPrompt()
    }
)
Direct Connection (SSH) - Python Client
We have some convenience methods to interact with our Document Store. You can create a client by explicitly setting credentials, or by retrieving them from AWS Secrets Manager.
If using credentials from the environment, please configure:
DOC_DB_HOST=************.us-west-2.docdb.amazonaws.com
DOC_DB_USERNAME=doc_db_username
DOC_DB_PASSWORD=doc_db_password
DOC_DB_SSH_HOST=ssh_host
DOC_DB_SSH_USERNAME=ssh_username
DOC_DB_SSH_PASSWORD=ssh_password
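Before constructing credentials from the environment, it can help to check that these variables are actually set. The following is a minimal sketch using only the standard library. It assumes all six variables listed above are required; the helper missing_env_vars is hypothetical and not part of aind-data-access-api.

```python
import os

# Variable names taken from the configuration listed above.
REQUIRED_ENV_VARS = (
    "DOC_DB_HOST",
    "DOC_DB_USERNAME",
    "DOC_DB_PASSWORD",
    "DOC_DB_SSH_HOST",
    "DOC_DB_SSH_USERNAME",
    "DOC_DB_SSH_PASSWORD",
)


def missing_env_vars(environ=None) -> list[str]:
    """Return the names of required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing DocDB settings: " + ", ".join(missing))
```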
To use the client:
from aind_data_access_api.document_db_ssh import DocumentDbSSHClient, DocumentDbSSHCredentials
# Method 1) if credentials are set in environment
credentials = DocumentDbSSHCredentials()
# Method 2) if you have permissions to AWS Secrets Manager
# Each secret must contain corresponding "host", "username", and "password"
credentials = DocumentDbSSHCredentials.from_secrets_manager(
    doc_db_secret_name="/doc/db/secret/name", ssh_secret_name="/ssh/tunnel/secret/name"
)
with DocumentDbSSHClient(credentials=credentials) as doc_db_client:
    # To get a list of filtered records:
    filter = {"subject.subject_id": "731015"}
    projection = {
        "name": 1, "created": 1, "location": 1, "subject.subject_id": 1, "subject.date_of_birth": 1,
    }
    count = doc_db_client.collection.count_documents(filter)
    response = list(doc_db_client.collection.find(filter=filter, projection=projection))
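The projection above is an inclusion projection: only the listed fields come back, plus _id, which MongoDB returns by default unless it is explicitly excluded. The sketch below emulates that behaviour on a plain dictionary to show the expected shape of each returned record. It is an approximation that handles nested dictionaries only (no arrays), and apply_projection and the sample record are hypothetical.

```python
def apply_projection(record: dict, projection: dict) -> dict:
    """Emulate an inclusion projection with dotted keys on nested dicts."""
    result = {}
    # _id is kept unless the projection explicitly sets "_id": 0.
    if projection.get("_id", 1) and "_id" in record:
        result["_id"] = record["_id"]
    for dotted_key, include in projection.items():
        if dotted_key == "_id" or not include:
            continue
        parts = dotted_key.split(".")
        value = record
        for part in parts:
            if not isinstance(value, dict) or part not in value:
                break  # field is absent, so nothing is copied
            value = value[part]
        else:
            # Every part was found: rebuild the same nesting in the result.
            target = result
            for part in parts[:-1]:
                target = target.setdefault(part, {})
            target[parts[-1]] = value
    return result


# Made-up record for illustration only.
record = {
    "_id": "example-id",
    "name": "example_asset_name",
    "created": "2024-01-01T00:00:00",
    "location": "s3://example-bucket/example_asset_name",
    "subject": {"subject_id": "731015", "date_of_birth": "2023-06-01", "sex": "Male"},
    "procedures": {"subject_procedures": []},
}
projection = {
    "name": 1, "created": 1, "location": 1, "subject.subject_id": 1, "subject.date_of_birth": 1,
}
projected = apply_projection(record, projection)
```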
RDS Tables
We have some convenience methods to interact with our Relational Database. You can create a client by explicitly setting credentials, or by retrieving them from AWS Secrets Manager.
from aind_data_access_api.rds_tables import RDSCredentials, Client
# Method one assuming user, password, and host are known
ds_client = Client(
    credentials=RDSCredentials(
        username="user",
        password="password",
        host="host",
        dbname="dev",
    ),
)
# Method two if you have permissions to AWS Secrets Manager
ds_client = Client(
    credentials=RDSCredentials(
        aws_secrets_name="aind/data/access/api/rds_tables"
    ),
)
# To retrieve a table as a pandas dataframe
df = ds_client.read_table(table_name="spike_sorting_urls")
# Can also pass in a custom sql query
cursor_result = ds_client.execute_query(query="SELECT * FROM spike_sorting_urls")
# It's also possible to save a pandas dataframe as a table. Please check internal documentation for more details.
ds_client.overwrite_table_with_df(df, table_name)
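When a custom query string is built from a variable table name, it may be worth validating the name before interpolating it into SQL. The sketch below is one possible approach, not something the library provides: select_all_query is a hypothetical helper, and the identifier rule (letters, digits, and underscores, not starting with a digit) is an assumption about how tables are named.

```python
import re

# Assumed naming rule: letters, digits, underscores; must not start with a digit.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def select_all_query(table_name: str) -> str:
    """Build a SELECT * query after checking the table name is a plain identifier."""
    if not _IDENTIFIER.fullmatch(table_name):
        raise ValueError(f"Unexpected table name: {table_name!r}")
    return f"SELECT * FROM {table_name}"


query = select_all_query("spike_sorting_urls")
# The resulting string could then be passed on, for example:
# cursor_result = ds_client.execute_query(query=query)
```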
Installation
Basic installation:
pip install aind-data-access-api
Optional Dependencies
Different features require different optional dependencies:
To use DocDB features (including MetadataDbClient):
pip install "aind-data-access-api[docdb]"
To use RDS features:
pip install "aind-data-access-api[rds]"
To use AWS Secrets management:
pip install "aind-data-access-api[secrets]"
To use the helpers package:
pip install "aind-data-access-api[helpers]"
To install all optional dependencies:
pip install "aind-data-access-api[full]"
Note: When using zsh or other shells that interpret square brackets, the quotes around the install argument are required.
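To see which features are usable in the current environment, one option is to try importing the corresponding modules. This is a rough sketch under two assumptions: that the extras map to the module paths used elsewhere in this guide, and that a missing optional dependency surfaces as an ImportError when the module is imported. The names can_import and available_features are hypothetical.

```python
from importlib import import_module

# Assumed mapping from install extras to the modules imported in this guide.
FEATURE_MODULES = {
    "docdb": "aind_data_access_api.document_db",
    "rds": "aind_data_access_api.rds_tables",
}


def can_import(module_name: str) -> bool:
    """Return True if the module imports cleanly in this environment."""
    try:
        import_module(module_name)
    except ImportError:
        return False
    return True


def available_features(modules=None) -> dict:
    """Map each extra name to whether its module can be imported."""
    modules = FEATURE_MODULES if modules is None else modules
    return {extra: can_import(name) for extra, name in modules.items()}


# Example: print(available_features())
```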
Reporting bugs or making feature requests
Please report any bugs or feature requests on the project's issues page.