Rell dapp code optimization

This topic provides insights into the process of analyzing and optimizing your dapp code written in Rell. You can use the Chromia CLI to examine SQL statements generated by your dapp, identify performance bottlenecks, and implement optimizations.

Consider the following simple Rell dapp:

# src/main.rell
module;

entity owner {
    name;
}

entity housekey {
    pubkey;
    owner;
}

entity house {
    address: text;
    owner;
}

entity house_owner {
    key house, owner;
}

query get_house_key(address: text): list<(name:text,housekey:pubkey)> {
  return (housekey, house_owner) @* {
    housekey.owner == .house.owner,
    house_owner.house.address == address
    } (
      name = house_owner.owner.name,
      housekey = housekey.pubkey
      );
}

# chromia.yml
blockchains:
  house-key-example:
    module: main
database:
  schema: schema_house-key-example

Analyzing SQL Statements

We aim to gain insights into how the query executes on the server to determine if any optimizations are needed. You can achieve this using the Chromia CLI's repl command with the --sql-log argument, which logs all SQL statements executed. Follow these steps:

Start the repl with the main module loaded:

$ chr repl --sql-log --use-db --module main
Rell 0.13.1
Type '\q' to quit or '\?' for help.
Current module: 'main'
>>>

This initiates an interactive Rell session where you can analyze any statement. Execute the get_house_key query with your desired input to observe the results:

>>> get_house_key("foo")
INFO [main] SqlConnectionLogger - [0]
    SELECT A05."name", A00."pubkey" FROM "c0.housekey" A00
    INNER JOIN "c0.owner" A02 ON A00."owner" = A02."rowid", "c0.house_owner" A01
    INNER JOIN "c0.house" A03 ON A01."house" = A03."rowid"
    INNER JOIN "c0.owner" A05 ON A01."owner" = A05."rowid"
    INNER JOIN "c0.owner" A04 ON A03."owner" = A04."rowid"
    WHERE (A02."rowid" = A04."rowid") AND (A03."address" = ?) ORDER BY A00."rowid", A01."rowid"

To enhance the query's performance, it's essential to consider various factors, with the impact varying depending on the volume of data in the tables (entities).

Optimizing Dapp code

Enhancing the performance of your Dapp involves several optimization techniques:

Keys and Indexing

Identify fields used in SELECT, JOIN, and WHERE clauses and introduce keys or indexes to boost performance. Fields directly referenced in the WHERE clause should have keys, while fields used in JOINs should be indexed.

For example, we can identify specific fields that play essential roles in the SELECT, JOIN, and WHERE clauses. These fields include A05."name", "A00."pubkey", A00."owner", A01."house", A01."owner", A03."owner", and A03."address". Recognizing these key fields suggests that enhancing performance can be achieved through indexing or implementing keys on these particular attributes. Considering this observation alone, we would proceed to make the subsequent refinements to our Dapp:

entity owner {
  key name;
}

entity housekey {
  key pubkey;
  index owner;
}

entity house {
  key address: text;
  index owner;
}

entity house_owner {
  key house, owner;
  index house, owner;
}

Keys are applied to uniquely identifiable fields directly referenced in the WHERE clause, and indexes are used for fields involved in JOIN operations.

Optimizing statement order

When optimizing your queries, pay close attention to the arrangement of conditions within your WHERE clause. To maximize efficiency, aim to minimize the number of checks by prioritizing conditions likely to yield fewer results. For instance, consider the following scenario:

In the WHERE clause, you can observe that the statement initially checks if two fields match before verifying the correctness of the address. This sequence could be significantly enhanced by reversing it. Since we anticipate only one result from the address check, placing it first would reduce the number of checks required.

WHERE (A02."rowid" = A04."rowid") AND (A03."address" = ?) ORDER BY A00."rowid", A01."rowid"

Similarly, the order of join statements can also be fine-tuned for improved performance. If you anticipate a significantly larger number of owners compared to houses, consider rearranging the first two joins. Achieve this by reordering the entities in the FROM section of the query:

The modified query is as follows:

query get_house_key(address: text): list<(name:text,housekey:pubkey)> {
  return (house_owner, housekey) @* {
    house_owner.house.address == address,
    housekey.owner == .house.owner
    } (
      name = house_owner.owner.name,
      housekey = housekey.pubkey
      );
}

Streamlining joins for improved performance

Within the scope of your query, even though it involves just two entities, you might notice that it leads to four joins. Reducing the number of joins can have a substantial impact on query performance.

Understanding join distribution

Join from the WHAT-Part: One of the joins arises from the WHAT-part, specifically name = house_owner.owner.name.
Join from the WHERE-Part: The remaining two joins originate from the WHERE-part, specifically housekey.owner == .house.owner.

Enhancing performance

To optimize query performance, consider strategically placing join conditions within the SQL statement. By introducing the house entity into the statement and testing it in the Rell console, you can achieve performance improvements:

>>> (house_owner, house, housekey) @* {
    house.address == "foo",
    house_owner.house == house,
    house_owner.owner == housekey.owner
    } (
      name = housekey.owner.name,
      housekey = housekey.pubkey
      );
INFO [main] SqlConnectionLogger - [0]
    SELECT A05."name", A02."pubkey" FROM "c0.house_owner" A00
    INNER JOIN "c0.house" A03 ON A00."house" = A03."rowid"
    INNER JOIN "c0.owner" A04 ON A00."owner" = A04."rowid", "c0.house" A01, "c0.housekey" A02 INNER JOIN "c0.owner" A05 ON A02."owner" = A05."rowid"
    WHERE ((A01."address" = ?) AND (A03."rowid" = A01."rowid")) AND (A04."rowid" = A05."rowid") ORDER BY A00."rowid", A01."rowid", A02."rowid"

This adjustment results in three joins with an additional check in the WHERE-statement. Given the involvement of three tables in this request, further reduction in joins may not be feasible. Implement this performance enhancement in your query:

query get_house_key(address: text): list<(name:text,housekey:pubkey)> {
  return (house_owner, house, housekey) @* {
    house.address == address,
    house_owner.house == house,
    house_owner.owner == housekey.owner
    } (
      name = housekey.owner.name,
      housekey = housekey.pubkey
      );
}

In short, understanding the intricacies of how SQL statements are generated from Rell code can be quite complex. However, this understanding is crucial for ensuring the scalability of your dapp. It involves considering factors like the volume of data within each entity and strategies to reduce cardinality, all of which significantly impact your dapp's performance as it scales to accommodate millions of users.

As a result of these optimization considerations, your Rell dapp code is refined to look like the following:

entity owner {
  key name;
}

entity housekey {
  key pubkey;
  index owner;
}

entity house {
  key address: text;
  index owner;
}

entity house_owner {
  key house, owner;
  index house, owner;
}

query get_house_key(address: text): list<(name:text,housekey:pubkey)> {
  return (house, housekey, house_owner) @* {
    house.address == address,
    house_owner.house == house,
    house_owner.owner == housekey.owner
    } (
      name = housekey.owner.name,
      housekey = housekey.pubkey
      );
}

Rell dapp code optimization

Analyzing SQL Statements​

Optimizing Dapp code​

Keys and Indexing​

Optimizing statement order​

Streamlining joins for improved performance​

Understanding join distribution​

Enhancing performance​