9 changes: 7 additions & 2 deletions pages/data-migration.mdx
@@ -15,7 +15,7 @@ instance. Whether your data is structured in files, relational databases, or
other graph databases, Memgraph provides the flexibility to integrate and
analyze your data efficiently.

Memgraph supports file system imports like CSV files, offering efficient and
Memgraph supports file system imports like Parquet and CSV files, offering efficient and
structured data ingestion. **However, if you want to migrate directly from
another data source, you can use the [`migrate`
module](/advanced-algorithms/available-algorithms/migrate)** from Memgraph MAGE
@@ -31,6 +31,11 @@ In order to learn all the pre-requisites for importing data into Memgraph, check

## File types

### Parquet files

Parquet files can be imported efficiently from the local disk and from S3-compatible storage (`s3://` URIs) using the
[`LOAD PARQUET` clause](/querying/clauses/load-parquet).

### CSV files

CSV files provide a simple and efficient way to import tabular data into Memgraph
@@ -262,4 +267,4 @@ nonsense or sales pitch, just tech.
/>
</Cards>

<CommunityLinks/>
<CommunityLinks/>
1 change: 1 addition & 0 deletions pages/data-migration/_meta.ts
@@ -1,6 +1,7 @@
export default {
"best-practices": "Best practices",
"csv": "CSV",
"parquet": "PARQUET",
"json": "JSON",
"cypherl": "CYPHERL",
"migrate-from-neo4j": "Migrate from Neo4j",
2 changes: 1 addition & 1 deletion pages/data-migration/best-practices.mdx
@@ -572,4 +572,4 @@ For more information about `Delta` objects, check the
information on the [IN_MEMORY_TRANSACTIONAL storage mode](/fundamentals/storage-memory-usage#in-memory-transactional-storage-mode-default).


<CommunityLinks/>
<CommunityLinks/>
252 changes: 252 additions & 0 deletions pages/data-migration/parquet.mdx
@@ -0,0 +1,252 @@
---
title: Import data from Parquet files
description: Leverage Parquet files in Memgraph operations. Our detailed guide simplifies the process for an enhanced graph computing journey.
---

import { Callout } from 'nextra/components'
import { Steps } from 'nextra/components'
import { Tabs } from 'nextra/components'

# Import data from Parquet files

Data from Parquet files can be imported from the local disk or from S3-compatible storage using the
[`LOAD PARQUET` Cypher clause](#load-parquet-cypher-clause).

## `LOAD PARQUET` Cypher clause

The `LOAD PARQUET` clause uses a background thread that reads column batches, assembles batches of 64K rows, and puts them on a queue from
which the main thread pulls the data. The main thread then reads the queue row by row, binds the contents of each parsed row to the
specified variable, and populates the database if it is empty or appends new data to an existing dataset.

### `LOAD PARQUET` clause syntax


The syntax of the `LOAD PARQUET` clause is:

```cypher
LOAD PARQUET FROM <parquet-location> ( WITH CONFIG configs=configMap )? AS <variable-name>
```

- `<parquet-location>` is a string with the location of the Parquet file.<br/> Without an
  `s3://` prefix, it refers to a path on the local disk; with the `s3://` prefix, it pulls the file at the specified URI from S3-compatible storage.
  There are no restrictions on where in
  your file system the file can be located, as long as the path is valid (i.e.,
  the file exists). If you are using Docker to run Memgraph, you will need to
  [copy the files from your local directory into the Docker
  container](/getting-started/first-steps-with-docker#copy-files-from-and-to-a-docker-container)
  so that Memgraph can access them. <br/>

- `<configs>` is an optional configuration map through which you can specify the following options (see the example after this list): `aws_region`, `aws_access_key`, `aws_secret_key` and `aws_endpoint_url`.
  - `<aws_region>`: The region in which your S3 service is located.
  - `<aws_access_key>`: The access key used to connect to the S3 service.
  - `<aws_secret_key>`: The secret key used to connect to the S3 service.
  - `<aws_endpoint_url>`: Optional; sets the URL of the S3-compatible storage.
- `<variable-name>` is a symbolic name representing the variable to which the
  contents of the parsed row will be bound, enabling access to the row
  contents later in the query. The variable doesn't have to be used in any
  subsequent clause.
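
A minimal sketch of `WITH CONFIG` in use; the bucket name, file, and credentials below are placeholders, not working values:

```cypher
LOAD PARQUET FROM "s3://my-bucket/people.parquet"
WITH CONFIG configs={aws_region: "eu-west-1", aws_access_key: "<access-key>", aws_secret_key: "<secret-key>"}
AS row
CREATE (p:Person {id: row.id, name: row.name});
```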

### `LOAD PARQUET` clause specificities

When using the `LOAD PARQUET` clause please keep in mind:

- The parser parses values into their appropriate types, so you should get the same type as in the Parquet file. The types `BOOL`, `INT8`, `INT16`, `INT32`, `INT64`, `UINT8`, `UINT16`, `UINT32`, `UINT64`,
`HALF_FLOAT`, `FLOAT`, `DOUBLE`, `STRING`, `LARGE_STRING`, `STRING_VIEW`, `DATE32`, `DATE64`, `TIME32`, `TIME64`, `TIMESTAMP`, `DURATION`, `DECIMAL128`, `DECIMAL256`, `BINARY`, `LARGE_BINARY`, `FIXED_SIZE_BINARY`,
`LIST` and `MAP` are supported. Unsupported types will be saved as strings in Memgraph.
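
  Because values keep their Parquet types, numeric columns can be used directly, without `toInteger`-style conversions. A sketch, assuming a hypothetical `/people.parquet` file with an integer `age` column:

  ```cypher
  LOAD PARQUET FROM "/people.parquet" AS row
  RETURN row.name AS name, row.age + 1 AS age_next_birthday;
  ```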

- Authentication parameters (`aws_region`, `aws_access_key`, `aws_secret_key` and `aws_endpoint_url`) can be provided in the `LOAD PARQUET` query using the `WITH CONFIG` construct, through environment variables
(`AWS_REGION`, `AWS_ACCESS_KEY`, `AWS_SECRET_KEY` and `AWS_ENDPOINT_URL`), or through run-time database settings. To set authentication parameters through run-time settings, use the `SET DATABASE SETTING <key> TO <value>;`
query. The keys of these authentication parameters are `aws.region`, `aws.access_key`, `aws.secret_key` and `aws.endpoint_url`.
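
  For example, to set them through run-time settings (the values below are placeholders):

  ```cypher
  SET DATABASE SETTING "aws.region" TO "eu-west-1";
  SET DATABASE SETTING "aws.access_key" TO "<your-access-key>";
  SET DATABASE SETTING "aws.secret_key" TO "<your-secret-key>";
  ```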

- **The `LOAD PARQUET` clause is not a standalone clause**, meaning a valid query
must contain at least one more clause, for example:

```cypher
LOAD PARQUET FROM "/people.parquet" AS row
CREATE (p:People) SET p += row;
```

In this regard, the following query will throw an exception:

```cypher
LOAD PARQUET FROM "/file.parquet" AS row;
```

**Adding a `MATCH` or `MERGE` clause before `LOAD PARQUET`** allows you to match certain
entities in the graph before running `LOAD PARQUET`, optimizing the process as
matched entities do not need to be searched for every row in the Parquet file.

However, the `MATCH` or `MERGE` clause can be used prior to the `LOAD PARQUET` clause only
if it returns a single row. Returning multiple rows before calling the
`LOAD PARQUET` clause will cause a Memgraph runtime error.
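
A sketch of this pattern, assuming a hypothetical `/orders.parquet` file and a single matching `:Country` node already in the graph:

```cypher
MATCH (c:Country {name: "Germany"})  // must return exactly one row
LOAD PARQUET FROM "/orders.parquet" AS row
CREATE (o:Order {id: row.id})-[:PLACED_IN]->(c);
```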

- **The `LOAD PARQUET` clause can be used at most once per query**, so queries like
the one below will throw an exception:

```cypher
LOAD PARQUET FROM "/x.parquet" AS x
LOAD PARQUET FROM "/y.parquet" AS y
CREATE (n:A {p1 : x, p2 : y});
```

### Increase import speed

The `LOAD PARQUET` clause will create relationships much faster and consequently
speed up data import if you [create indexes](/fundamentals/indexes) on nodes or
node properties once you import them:

```cypher
CREATE INDEX ON :Node(id);
```

If the `LOAD PARQUET` clause is merging data instead of creating it, create indexes
before running the `LOAD PARQUET` clause.


The `USING PERIODIC COMMIT <BATCH_SIZE>` construct also improves import speed because
it optimizes some of the memory allocation patterns. In our benchmarks, this construct
speeds up execution by 25% to 35%.

```cypher
USING PERIODIC COMMIT 1024 LOAD PARQUET FROM "/x.parquet" AS x
CREATE (n:A {p1: x.p1, p2: x.p2});
```


You can also speed up import if you switch Memgraph to [**analytical storage
mode**](/fundamentals/storage-memory-usage#storage-modes). In the analytical
storage mode there are no ACID guarantees besides manually created snapshots.
After import you can switch the storage mode back to
transactional and enable ACID guarantees.

You can switch between modes within the session using the following query:

```cypher
STORAGE MODE IN_MEMORY_{TRANSACTIONAL|ANALYTICAL};
```

If you use `IN_MEMORY_ANALYTICAL` mode and have nodes and relationships stored in
separate Parquet files, you can run multiple concurrent `LOAD PARQUET` queries to import data even faster.
To achieve the best import performance, split your nodes and relationships
files into smaller files and run multiple `LOAD PARQUET` queries in parallel.
The key is to run all `LOAD PARQUET` queries that create nodes first. After that, run
all `LOAD PARQUET` queries that create relationships.
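
For example, assuming the nodes are split into hypothetical `nodes_0.parquet` and `nodes_1.parquet` files, each of the following queries can run concurrently from its own client session:

```cypher
// session 1
LOAD PARQUET FROM "/nodes_0.parquet" AS row
CREATE (n:Node {id: row.id});
```

```cypher
// session 2, run at the same time
LOAD PARQUET FROM "/nodes_1.parquet" AS row
CREATE (n:Node {id: row.id});
```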


### Import multiple Parquet files with distinct graph objects

In this example, the data is split across four files; each file contains nodes
of a single label or relationships of a single type.

<Steps>

{<h3 className="custom-header">Download the files</h3>}

- [`people_nodes.parquet`](https://public-assets.memgraph.com/import-data/load-csv-cypher/multiple-types-nodes/people_nodes.parquet) is used to create nodes labeled `:Person`.<br/> The file contains the following data:
```parquet
id,name,age,city
100,Daniel,30,London
101,Alex,15,Paris
102,Sarah,17,London
103,Mia,25,Zagreb
104,Lucy,21,Paris
```
- [`restaurants_nodes.parquet`](https://public-assets.memgraph.com/import-data/load-csv-cypher/multiple-types-nodes/restaurants_nodes.parquet) is used to create nodes labeled `:Restaurant`.<br/> The file contains the following data:
```parquet
id,name,menu
200,Mc Donalds,Fries;BigMac;McChicken;Apple Pie
201,KFC,Fried Chicken;Fries;Chicken Bucket
202,Subway,Ham Sandwich;Turkey Sandwich;Foot-long
203,Dominos,Pepperoni Pizza;Double Dish Pizza;Cheese filled Crust
```

- [`people_relationships.parquet`](https://public-assets.memgraph.com/import-data/load-csv-cypher/multiple-types-nodes/people_relationships.parquet) is used to connect people with the `:IS_FRIENDS_WITH` relationship.<br/> The file contains the following data:
```parquet
first_person,second_person,met_in
100,102,2014
103,101,2021
102,103,2005
101,104,2005
104,100,2018
101,102,2017
100,103,2001
```
- [`restaurants_relationships.parquet`](https://public-assets.memgraph.com/import-data/load-csv-cypher/multiple-types-nodes/restaurants_relationships.parquet) is used to connect people with restaurants using the `:ATE_AT` relationship.<br/> The file contains the following data:
```parquet
PERSON_ID,REST_ID,liked
100,200,true
103,201,false
104,200,true
101,202,false
101,203,false
101,200,true
102,201,true
```

{<h3 className="custom-header">Check the location of the Parquet files</h3>}
If you are working with Docker, [copy the files from your local directory into
the Docker container](/getting-started/first-steps-with-docker#copy-files-from-and-to-a-docker-container)
so that Memgraph can access them.

{<h3 className="custom-header">Import nodes</h3>}

Each row will be parsed as a map, and the
fields can be accessed using the property lookup syntax (e.g. `id: row.id`).

The following query will load row by row from the file, and create a new node
for each row with properties based on the parsed row values:

```cypher
LOAD PARQUET FROM "/path-to/people_nodes_wh.parquet" AS row
CREATE (n:Person {id: row.id, name: row.name, age: row.age, city: row.city});
```

In the same manner, the following query will create new nodes for each restaurant:

```cypher
LOAD PARQUET FROM "/path-to/restaurants_nodes.parquet" AS row
CREATE (n:Restaurant {id: row.id, name: row.name, menu: row.menu});
```

{<h3 className="custom-header">Create indexes</h3>}

Creating an [index](/fundamentals/indexes) on a property used to connect nodes
with relationships, in this case, the `id` property of the `:Person` nodes,
will speed up the import of relationships, especially with large datasets:

```cypher
CREATE INDEX ON :Person(id);
```

{<h3 className="custom-header">Import relationships</h3>}
The following query will create relationships between the people nodes:

```cypher
LOAD PARQUET FROM "/path-to/people_relationships.parquet" AS row
MATCH (p1:Person {id: row.first_person})
MATCH (p2:Person {id: row.second_person})
CREATE (p1)-[f:IS_FRIENDS_WITH]->(p2)
SET f.met_in = row.met_in;
```

The following query will create relationships between people and restaurants where they ate:

```cypher
LOAD PARQUET FROM "/path-to/restaurants_relationships.parquet" AS row
MATCH (p1:Person {id: row.PERSON_ID})
MATCH (re:Restaurant {id: row.REST_ID})
CREATE (p1)-[ate:ATE_AT]->(re)
SET ate.liked = toBoolean(row.liked);
```

{<h3 className="custom-header">Final result</h3>}
Run the following query to see how the imported data looks as a graph:

```cypher
MATCH p=()-[]-() RETURN p;
```

![](/pages/data-migration/csv/load_csv_restaurants_relationships.png)

</Steps>
@@ -159,7 +159,7 @@ of the following commands:
| Privilege to enforce [constraints](/fundamentals/constraints). | `CONSTRAINT` |
| Privilege to [dump the database](/configuration/data-durability-and-backup#database-dump).| `DUMP` |
| Privilege to use [replication](/clustering/replication) queries. | `REPLICATION` |
| Privilege to access files in queries, for example, when using `LOAD CSV` clause. | `READ_FILE` |
| Privilege to access files in queries, for example, when using `LOAD CSV` and `LOAD PARQUET` clauses. | `READ_FILE` |
| Privilege to manage [durability files](/configuration/data-durability-and-backup#database-dump). | `DURABILITY` |
| Privilege to try and [free memory](/fundamentals/storage-memory-usage#deallocating-memory). | `FREE_MEMORY` |
| Privilege to use [trigger queries](/fundamentals/triggers). | `TRIGGER` |
17 changes: 17 additions & 0 deletions pages/database-management/configuration.mdx
@@ -318,6 +318,10 @@ fallback to the value of the command-line argument.
| hops_limit_partial_results | If set to `true`, partial results are returned when the hops limit is reached. If set to `false`, an exception is thrown when the hops limit is reached. The default value is `true`. | yes |
| timezone | IANA timezone identifier string setting the instance's timezone. | yes |
| storage.snapshot.interval | Define periodic snapshot schedule via cron expression ([crontab](https://crontab.guru/) format, an [Enterprise feature](/database-management/enabling-memgraph-enterprise)) or as a period in seconds. Set to empty string to disable. | no |
| aws.region | AWS region in which your S3 service is located. | yes |
| aws.access_key | Access key used to read the file from S3. | yes |
| aws.secret_key | Secret key used to read the file from S3. | yes |
| aws.endpoint_url | URL on which S3 can be accessed (if using some other S3-compatible storage). | yes |
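
For example, a sketch of changing one of these settings at run time (the URL below is a placeholder for a local S3-compatible service):

```cypher
SET DATABASE SETTING "aws.endpoint_url" TO "http://localhost:9000";
```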

All settings can be fetched by calling the following query:

@@ -481,6 +485,19 @@ connections in Memgraph.
| `--stream-transaction-retry-interval=500` | The interval to wait (measured in milliseconds) before retrying to execute again a conflicting transaction. | `[uint32]` |


### AWS

This section contains the list of flags that are used when connecting to S3-compatible storage.


| Flag | Description | Type |
|--------------------------------------------|-------------------------------------------------------------------------------------------------------------|------------|
| `--aws-region` | AWS region in which your S3 service is located. | `[string]` |
| `--aws-access-key`                          | Access key used to read the file from S3.                                                                     | `[string]` |
| `--aws-secret-key`                          | Secret key used to read the file from S3.                                                                     | `[string]` |
| `--aws-endpoint-url` | URL on which S3 can be accessed (if using some other S3-compatible storage). | `[string]` |


### Other

This section contains the list of all other relevant flags used within Memgraph.
10 changes: 5 additions & 5 deletions pages/help-center/faq.mdx
@@ -212,11 +212,11 @@ us](https://memgraph.com/enterprise-trial) for more information.

### What is the fastest way to import data into Memgraph?

Currently, the fastest way to import data is from a CSV file with a [LOAD CSV
clause](/data-migration/csv). Check out the [best practices for importing
Currently, the fastest way to import data is from a Parquet file with a [LOAD PARQUET
clause](/data-migration/parquet). Check out the [best practices for importing
data](/data-migration/best-practices).

[Other import methods](/data-migration) include importing data from JSON and CYPHERL files,
[Other import methods](/data-migration) include importing data from CSV, JSON and CYPHERL files,
migrating from relational databases, or connecting to a data stream.

### How to import data from MySQL or PostgreSQL?
@@ -226,11 +226,11 @@ You can migrate from [MySQL](/data-migration/migrate-from-rdbms) or

### What file formats does Memgraph support for import?

You can import data from [CSV](/data-migration/csv),
You can import data from [CSV](/data-migration/csv), [PARQUET](/data-migration/parquet),
[JSON](/data-migration/json) or [CYPHERL](/data-migration/cypherl) files.

CSV files can be imported in on-premise instances using the [LOAD CSV
clause](/data-migration/csv), and JSON files can be imported using a
clause](/data-migration/csv), PARQUET files can be imported using the [LOAD PARQUET clause](/data-migration/parquet), and JSON files can be imported using a
[json_util](/advanced-algorithms/available-algorithms/json_util) module from the
MAGE library. On a Cloud instance, data from CSV and JSON files can be imported only
from a remote address.
6 changes: 5 additions & 1 deletion pages/index.mdx
@@ -165,6 +165,10 @@ JSON files, and import data using queries within a CYPHERL file.
title="JSON"
href="/data-migration/json"
/>
<Cards.Card
title="PARQUET"
href="/data-migration/parquet"
/>
<Cards.Card
title="CYPHERL"
href="/data-migration/cypherl"
@@ -337,4 +341,4 @@ Ensure alignment with the latest updates and changes.
/>
</Cards>

<CommunityLinks/>
<CommunityLinks/>