Documentation
Everything you need to get started with Schemantic and make the most of its powerful data mapping and analysis features.
Reach out to us at hello@schemantic.io to get started.
Schemantic accesses your warehouse via read-only, per-workspace service-account credentials (OAuth 2.0 / IAM), and all calculations on raw data execute inside your cloud environment. Row-level data and table contents — including values like minimums, maximums, and modes — do not leave it; where the UI displays such a value, it is passed through at render time and never stored externally. The only information temporarily held outside your cloud is structural metadata and aggregate statistics (schemas, types, null rates, row counts, join scores) — encrypted at rest and in transit, hashed where applicable, held under strict tenant isolation with no co-mingling across customers, and deleted after subscription termination. Schemantic is SOC 2 Type II attested across all five Trust Service Criteria.
Availability: Available once join inference completes. Access from workspace landing page or map icon in left nav.
Controls:
- Zoom (in/out/fit)
- Layout method (vertical/horizontal/force)
- Join lines filter (approved/recommended/review)
- Granularity (dataset only, tables only, joined columns, all columns)
Path generation: Select 2+ tables, click Generate Paths, view visual path and SQL. SQL modal supports changing join types (left/right/inner/outer) and clipboard copy.
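As a rough illustration of the kind of SQL a generated path could correspond to, the sketch below assembles a join statement from a list of hops. It is not Schemantic's generator: `JoinStep`, `build_path_sql`, and the table and column names are hypothetical, and mapping "outer" to `FULL OUTER JOIN` is an assumption.

```python
from dataclasses import dataclass

# Join types offered in the SQL modal, mapped to SQL keywords.
# Assumption: "outer" means FULL OUTER JOIN.
JOIN_KEYWORDS = {
    "left": "LEFT JOIN",
    "right": "RIGHT JOIN",
    "inner": "INNER JOIN",
    "outer": "FULL OUTER JOIN",
}


@dataclass
class JoinStep:
    """One hop in a path: left_table.left_column = right_table.right_column."""
    left_table: str
    left_column: str
    right_table: str
    right_column: str
    join_type: str = "inner"


def build_path_sql(start_table: str, steps: list[JoinStep]) -> str:
    """Return a SELECT statement that walks the path hop by hop."""
    lines = ["SELECT *", f"FROM {start_table}"]
    for step in steps:
        keyword = JOIN_KEYWORDS[step.join_type]
        lines.append(f"{keyword} {step.right_table}")
        lines.append(
            f"  ON {step.left_table}.{step.left_column}"
            f" = {step.right_table}.{step.right_column}"
        )
    return "\n".join(lines)


# Hypothetical three-table path: orders -> customers -> regions.
path = [
    JoinStep("orders", "customer_id", "customers", "id", join_type="left"),
    JoinStep("customers", "region_id", "regions", "id"),
]
sql = build_path_sql("orders", path)
print(sql)
```

Changing a hop's `join_type` in this sketch plays the same role as switching the join type in the SQL modal: only the keyword changes, not the join condition.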
Access via:
- Right-click table in Join Map → View Table Statistics
- Stats button in Data Sources table
- Stats button in Entities table
- Stats button in Entity Details Associated Tables
Includes: Table Composition Overview (column/row counts, column type breakdown) and Descriptive Statistics table (data type, functional type, null rate, unique values, plus type-specific stats like min/max/mean for numbers).
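To make the descriptive statistics concrete, here is a minimal sketch of how null rate, unique values, and the numeric min/max/mean could be computed for one column. It runs on an in-memory list purely for illustration; `describe_column` is a hypothetical helper, and Schemantic's own calculations run inside your warehouse and may define these statistics differently.

```python
from statistics import mean


def describe_column(values):
    """Compute simple per-column statistics; None stands for a null cell."""
    total = len(values)
    non_null = [v for v in values if v is not None]
    stats = {
        # Share of cells that are null (0.0 for an empty column).
        "null_rate": (total - len(non_null)) / total if total else 0.0,
        # Count of distinct non-null values.
        "unique_values": len(set(non_null)),
    }
    # Type-specific statistics: only for columns whose non-null values are all numbers.
    is_numeric = bool(non_null) and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in non_null
    )
    if is_numeric:
        stats.update(min=min(non_null), max=max(non_null), mean=mean(non_null))
    return stats


numeric_stats = describe_column([1, 2, None, 3])
text_stats = describe_column(["a", "b", "a", None])
print(numeric_stats)
print(text_stats)
```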
- Open the project
- Join Map
- Edit Workspaces
- Edit data sources
- Select/deselect data
- Save data sources
- Join your data
Universal Filter: Applies workspace-wide across all visualizations. Located at top of Join Map and Entity Hub. Supports plain text search and key:value syntax (table:, dataset:, column:) with AND/OR operators. Filters can be saved, applied before entering workspace, and deleted.
Individual visualizations: Also have their own search boxes with the same syntax.
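The filter syntax can be pictured with a small evaluator. This is a sketch under stated assumptions, not Schemantic's parser: it assumes AND binds tighter than OR, that operators are written in uppercase with surrounding spaces, that matching is a case-insensitive substring test, and that plain text matches any field. The `matches` and `term_matches` functions and the sample item are hypothetical.

```python
# Keys supported by the key:value syntax.
KEYS = ("table", "dataset", "column")


def term_matches(term, item):
    """Check one term, either key:value or plain text, against one item."""
    key, sep, value = term.partition(":")
    if sep and key in KEYS:
        # key:value term: compare against that field only.
        return value.lower() in item.get(key, "").lower()
    # Plain text term: match against every field.
    return any(term.lower() in str(v).lower() for v in item.values())


def matches(query, item):
    """Evaluate a query as OR-separated groups of AND-separated terms."""
    return any(
        all(term_matches(t.strip(), item) for t in group.split(" AND "))
        for group in query.split(" OR ")
    )


# Hypothetical item describing one column in a workspace.
item = {"dataset": "sales", "table": "orders", "column": "customer_id"}

print(matches("table:orders AND column:customer", item))
print(matches("table:users OR dataset:sales", item))
print(matches("table:users AND dataset:sales", item))
```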
Yes. SOC 2 Type II attested across all five Trust Service Criteria: Security, Availability, Processing Integrity, Confidentiality, and Privacy.
Check the Schemantic Glossary (link in tool).
Only PII you already have access to in the cloud provider. PII may appear in column names (rare) or in string descriptive statistics (min/max values).
For GCP: A project with BigQuery data and an account with permissions for:
- resourcemanager.projects.list
- resourcemanager.projects.get
- serviceusage.services.enable
- iam.roles.create
- iam.roles.update
- BigQuery permissions for datasets and tables (create, delete, get, link, setIamPolicy, update, jobs, readsessions, tables management)
Subsequent users only need project-specific BigQuery permissions. Note: row-level permissions not supported.
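Before onboarding, it can help to compare the permissions an account holds against the non-BigQuery permissions named above. The sketch below only does the set comparison; `missing_permissions` is a hypothetical helper, and how the list of granted permissions is obtained (for example, from your IAM tooling) is left out as an assumption.

```python
# Non-BigQuery permissions named in the GCP requirements above.
REQUIRED_PERMISSIONS = {
    "resourcemanager.projects.list",
    "resourcemanager.projects.get",
    "serviceusage.services.enable",
    "iam.roles.create",
    "iam.roles.update",
}


def missing_permissions(granted):
    """Return the required permissions absent from `granted`, sorted by name."""
    return sorted(REQUIRED_PERMISSIONS - set(granted))


# Hypothetical account that holds only two of the five permissions.
granted = ["resourcemanager.projects.get", "iam.roles.create"]
print(missing_permissions(granted))
```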
During GCP onboarding, temporary IAM permissions verify project access, create a service account, and grant that account limited IAM permissions for re-verifying user access on each login. If a user gains or loses GCP project access, Schemantic access follows.
Row-level data access follows your cloud provider's IAM policies. Schemantic does surface metadata — column names, join relationships, and descriptive statistics — for tables included in analysis, which may be visible to users who lack direct access to the underlying data.
The service account needs BigQuery permissions to perform data processing within your environment.
In excess of 99%, built on major cloud providers and established technologies. Scheduled maintenance may cause temporary pauses. External factors (network outages, third-party disruptions) may occasionally impact performance.
Virtually no customer data transfers to Schemantic's environment. Two exceptions for browser rendering:
- Table/column name metadata
- Calculated statistics for in-tool use (median, mean, mode)
Possible but extremely unlikely for most use cases. Schemantic runs on demand or infrequently and completes in under an hour using a small fraction of available resources. Exceptionally large datasets may take several hours and consume more resources, but such runs are periodic rather than continuous.
Not a standalone compliance solution. Supporting role:
- Data Minimization: Reveals redundant storage
- Data Traceability: Clear audit trail of column/ID overlaps
- Reporting: Automates valid join identification for multi-source pulls
Does not replace compliance programs — organizations remain responsible for governance, access controls, and regulatory adherence.