5. Working with content

In this chapter, we demonstrate features of Swarm related to storage and retrieval. First we discuss how to solve mutability of resources in a content addressed system using the Ethereum Name Service on the blockchain, then using Mutable Resource Updates in Swarm. Then we briefly discuss how to protect your data by restricting access using encryption. We also discuss in detail how files can be organised into collections using manifests and how this allows virtual hosting of websites. Another form of interaction with Swarm, namely mounting a Swarm manifest as a local directory using FUSE. We conclude by summarizing the various URL schemes that provide simple http endpoints for clients to interact with Swarm.

5.1. Using ENS names

Note

In order to resolve ENS names, your Swarm node has to be connected to an Ethereum blockchain (mainnet, or testnet). See Getting Started for instructions. This section explains how you can register your content to your ENS name.

ENS is the system that Swarm uses to permit content to be referred to by a human-readable name, such as “theswarm.eth”. It operates analogously to the DNS system, translating human-readable names into machine identifiers - in this case, the Swarm hash of the content you’re referring to. By registering a name and setting it to resolve to the content hash of the root manifest of your site, users can access your site via a URL such as bzz://theswarm.eth/.

Note

Currently The bzz scheme is not supported in major browsers such as Chrome, Firefox or Safari. If you want to access the bzz scheme through these browsers, currently you have to either use an HTTP gateway, such as https://swarm-gateways.net/bzz:/theswarm.eth/ or use a browser which supports the bzz scheme, such as Mist <https://github.com/ethereum/mist>.

Suppose we upload a directory to Swarm containing (among other things) the file example.pdf.

swarm --recursive up /path/to/dir
>2477cc8584cc61091b5cc084cdcdb45bf3c6210c263b0143f030cf7d750e894d

If we register the root hash as the content for theswarm.eth, then we can access the pdf at

bzz://theswarm.eth/example.pdf

if we are using a Swarm-enabled browser, or at

http://localhost:8500/bzz:/theswarm.eth/example.pdf

via a local gateway. We will get served the same content as with:

http://localhost:8500/bzz:/2477cc8584cc61091b5cc084cdcdb45bf3c6210c263b0143f030cf7d750e894d/example.pdf

Please refer to the official ENS documentation for the full details on how to register content hashes to ENS.

In short, the steps you must take are:

  1. Register an ENS name.
  2. Associate a resolver with that name.
  3. Register the Swarm hash with the resolver as the content.

We recommend using https://manager.ens.domains/. This will make it easy for you to:

  • Associate the default resolver with your name
  • Register a Swarm hash.

Note

When you register a Swarm hash with https://manager.ens.domains/ you MUST prefix the hash with 0x. For example 0x2477cc8584cc61091b5cc084cdcdb45bf3c6210c263b0143f030cf7d750e894d

5.1.1. Overview of ENS (video)

Nick Johnson on the Ethereum Name System

5.2. Feeds

Note

Feeds, previously known as Mutable Resource Updates, is an experimental feature, available since Swarm POC3. It is under active development, so expect things to change.

Since Swarm hashes are content addressed, changes to data will constantly result in changing hashes. Swarm Feeds provide a way to easily overcome this problem and provide a single, persistent, identifier to follow sequential data.

The usual way of keeping the same pointer to changing data is using the Ethereum Name Service (ENS). However, since ENS is an on-chain feature, it might not be suitable for each use case since:

  1. Every update to an ENS resolver will cost gas to execute
  2. It is not be possible to change the data faster than the rate that new blocks are mined
  3. ENS resolution requires your node to be synced to the blockchain

Swarm Feeds provide a way to have a persistent identifier for changing data without having to use ENS. It is named Feeds for its similarity with a news feed.

If you are using Feeds in conjunction with an ENS resolver contract, only one initial transaction to register the “Feed manifest address” will be necessary. This key will resolve to the latest version of the Feed (updating the Feed will not change the key).

You can think of a Feed as a user’s Twitter account, where he/she posts updates about a particular Topic. In fact, the Feed object is simply defined as:

type Feed struct {
  Topic Topic
  User  common.Address
}

That is, a specific user posting updates about a specific Topic.

Users can post to any topic. If you know the user’s address and agree on a particular Topic, you can then effectively “follow” that user’s Feed.

Important

How you build the Topic is entirely up to your application. You could calculate a hash of something and use that, the recommendation is that it should be easy to derive out of information that is accesible to other users.

For convenience, feed.NewTopic() provides a way to “merge” a byte array with a string in order to build a Feed Topic out of both. This is used at the API level to create the illusion of subtopics. This way of building topics allows to use a random byte array (for example the hash of a photo) and merge it with a human-readable string such as “comments” in order to create a Topic that could represent the comments about that particular photo. This way, when you see a picture in a website you could immediately build a Topic out of it and see if some user posted comments about that photo.

Feeds are not created, only updated. If a particular Feed (user, topic combination) has never posted to, trying to fetch updates will yield nothing.

5.2.1. Feed Manifests

A Feed Manifest is simply a JSON object that contains the Topic and User of a particular Feed (i.e., a serialized Feed object). Uploading this JSON object to Swarm in the regular way will return the immutable hash of this object. We can then store this immutable hash in an ENS Resolver so that we can have a ENS domain that “follows” the Feed described in the manifest.

5.2.2. Feeds API

There are 3 different ways of interacting with Feeds : HTTP API, Golang API and Swarm CLI.

5.2.2.1. HTTP API

5.2.2.1.1. Posting to a Feed

Since Feed updates need to be signed, and an update has some correlation with a previous update, it is necessary to retrieve first the Feed’s current status. Thus, the first step to post an update will be to retrieve this current status in a ready-to-sign template:

  1. Get Feed template

GET /bzz-feed:/?topic=<TOPIC>&user=<USER>&meta=1

GET /bzz-feed:/<MANIFEST OR ENS NAME>/?meta=1

Where:
  • user: Ethereum address of the user who publishes the Feed
  • topic: Feed topic, encoded as a hex string. Topic is an arbitrary 32-byte string (64 hex chars)

Note

  • If topic is omitted, it is assumed to be zero, 0x000…
  • if name=<name> (optional) is provided, a subtopic is composed with that name
  • A common use is to omit topic and just use name, allowing for human-readable topics

You will receive a JSON like the below:

{
  "feed": {
    "topic": "0x6a61766900000000000000000000000000000000000000000000000000000000",
    "user": "0xdfa2db618eacbfe84e94a71dda2492240993c45b"
  },
  "epoch": {
    "level": 16,
    "time": 1534237239
  }
  "protocolVersion" : 0,
}
  1. Post the update

Extract the fields out of the JSON and build a query string as below:

POST /bzz-feed:/?topic=<TOPIC>&user=<USER>&level=<LEVEL>&time=<TIME>&signature=<SIGNATURE>

Where:
  • topic: Feed topic, as specified above
  • user: your Ethereum address
  • level: Suggested frequency level retrieved in the JSON above
  • time: Suggested timestamp retrieved in the JSON above
  • protocolVersion: Feeds protocol version. Currently 0
  • signature: Signature, hex encoded. See below on how to calclulate the signature
  • Request posted data: binary stream with the update data
5.2.2.1.2. Reading a Feed

To retrieve a Feed’s last update:

GET /bzz-feed:/?topic=<TOPIC>&user=<USER>

GET /bzz-feed:/<MANIFEST OR ENS NAME>

Note

  • Again, if topic is omitted, it is assumed to be zero, 0x000…
  • If name=<name> is provided, a subtopic is composed with that name
  • A common use is to omit topic and just use name, allowing for human-readable topics, for example: GET /bzz-feed:/?name=profile-picture&user=<USER>

To get a previous update:

Add an addtional time parameter. The last update before that time (unix time) will be looked up.

GET /bzz-feed:/?topic=<TOPIC>&user=<USER>&time=<T>

GET /bzz-feed:/<MANIFEST OR ENS NAME>?time=<T>

5.2.2.1.3. Creating a Feed Manifest

To create a Feed manifest using the HTTP API:

POST /bzz-feed:/?topic=<TOPIC>&user=<USER>&manifest=1. With an empty body.

This will create a manifest referencing the provided Feed.

Note

This API call will be deprecated in the near future.

5.2.2.2. Go API

5.2.2.2.1. Query object

The Query object allows you to build a query to browse a particular Feed.

The default Query, obtained with feed.NewQueryLatest() will build a Query that retrieves the latest update of the given Feed.

You can also use feed.NewQuery() instead, if you want to build a Query to look up an update before a certain date.

Advanced usage of Query includes hinting the lookup algorithm for faster lookups. The default hint lookup.NoClue will have your node track Feeds you query frequently and handle hints automatically.

5.2.2.2.2. Request object

The Request object makes it easy to construct and sign a request to Swarm to update a particular Feed. It contains methods to sign and add data. We can manually build the Request object, or fetch a valid “template” to use for the update.

A Request can also be serialized to JSON in case you need your application to delegate signatures, such as having a browser sign a Feed update request.

5.2.2.2.3. Posting to a Feed
  1. Retrieve a Request object or build one from scratch. To retrieve a ready-to-sign one:
func (c *Client) GetFeedRequest(query *feed.Query, manifestAddressOrDomain string) (*feed.Request, error)
  1. Use Request.SetData() and Request.Sign() to load the payload data into the request and sign it
  2. Call UpdateFeed() with the filled Request:
func (c *Client) UpdateFeed(request *feed.Request, createManifest bool) (io.ReadCloser, error)
5.2.2.2.4. Reading a Feed

To retrieve a Feed update, use client.QueryFeed(). QueryFeed returns a byte stream with the raw content of the Feed update.

func (c *Client) QueryFeed(query *feed.Query, manifestAddressOrDomain string) (io.ReadCloser, error)

manifestAddressOrDomain is the address you obtained in CreateFeedWithManifest or an ENS domain whose Resolver points to that address. query is a Query object, as defined above.

You only need to provide either manifestAddressOrDomain or Query to QueryFeed(). Set to "" or nil respectively.

5.2.2.2.5. Creating a Feed Manifest

Swarm client (package swarm/api/client) has the following method:

func (c *Client) CreateFeedWithManifest(request *feed.Request) (string, error)

CreateFeedWithManifest uses the request parameter to set and create a Feed manifest.

Returns the resulting Feed manifest address that you can set in an ENS Resolver (setContent) or reference future updates using Client.UpdateFeed()

5.2.2.2.6. Example Go code
// Build a `Feed` object to track a particular user's updates
f := new(feed.Feed)
f.User = signer.Address()
f.Topic, _ = feed.NewTopic("weather",nil)

// Build a `Query` to retrieve a current Request for this feed
query := feeds.NewQueryLatest(&f, lookup.NoClue)

// Retrieve a ready-to-sign request using our query
// (queries can be reused)
request, err := client.GetFeedRequest(query, "")
if err != nil {
    utils.Fatalf("Error retrieving feed status: %s", err.Error())
}

// set the new data
request.SetData([]byte("Weather looks bright and sunny today, we should merge this PR and go out enjoy"))

// sign update
if err = request.Sign(signer); err != nil {
    utils.Fatalf("Error signing feed update: %s", err.Error())
}

// post update
err = client.UpdateFeed(request)
if err != nil {
    utils.Fatalf("Error updating feed: %s", err.Error())
}

5.2.2.3. Command-Line

5.2.2.3.1. Posting to a Feed

To update a Feed with the cli:

swarm feed update [command options] <0x Hex data>

creates a new update on the specified topic
           The topic can be specified directly with the --topic flag as an hex string
           If no topic is specified, the default topic (zero) will be used
           The --name flag can be used to specify subtopics with a specific name.
           If you have a manifest, you can specify it with --manifest instead of --topic / --name
           to refer to the feed
  OPTIONS:
 --manifest value  Refers to the feed through a manifest
 --name value      User-defined name for the new feed, limited to 32 characters. If combined with topic, the feed will be a       subtopic with this name
 --topic value     User-defined topic this feed is tracking, hex encoded. Limited to 64 hexadecimal characters
5.2.2.3.2. Reading Feed status
swarm feed info [command options] [arguments...]

obtains information about an existing Swarm feed
          The topic can be specified directly with the --topic flag as an hex string
          If no topic is specified, the default topic (zero) will be used
          The --name flag can be used to specify subtopics with a specific name.
          The --user flag allows to refer to a user other than yourself. If not specified,
          it will then default to your local account (--bzzaccount)
          If you have a manifest, you can specify it with --manifest instead of --topic / --name / ---user
          to refer to the feed

OPTIONS:
--manifest value  Refers to the feed through a manifest
--name value      User-defined name for the new feed, limited to 32 characters. If combined with topic, it will refer to a subtopic with this name
--topic value     User-defined topic this feed is tracking, hex encoded. Limited to 64 hexadecimal characters
--user value      Indicates the user who updates the feed
5.2.2.3.3. Creating a Feed Manifest

The Swarm CLI allows to create Feed Manifests directly from the console:

swarm feed create is defined as a command to create and publish a Feed manifest.

swarm feed create [command options]

creates and publishes a new feed manifest pointing to a specified user's updates about a particular topic.
          The feed topic can be built in the following ways:
          * use --topic to set the topic to an arbitrary binary hex string.
          * use --name to set the topic to a human-readable name.
              For example --name could be set to "profile-picture", meaning this feed allows to get this user's current profile picture.
          * use both --topic and --name to create named subtopics.
            For example, --topic could be set to an Ethereum contract address and --name could be set to "comments", meaning
            this feed tracks a discussion about that contract.
          The --user flag allows to have this manifest refer to a user other than yourself. If not specified,
          it will then default to your local account (--bzzaccount)

OPTIONS:
--name value   User-defined name for the new feed, limited to 32 characters. If combined with topic, it will refer to a subtopic with this name
--topic value  User-defined topic this feed is tracking, hex encoded. Limited to 64 hexadecimal characters
--user value   Indicates the user who updates the feed

5.2.3. Computing Feed Signatures

  1. computing the digest:
The digest is computed concatenating the following:
  • 1-byte protocol version (currently 0)
  • 7-bytes padding, set to 0
  • 32-bytes topic
  • 20-bytes user address
  • 7-bytes time, little endian
  • 1-byte level
  • payload data (variable length)
  1. Take the SHA3 hash of the above digest
  2. Compute the ECDSA signature of the hash
  3. Convert to hex string and put in the signature field above

5.2.3.1. JavaScript example

var web3 = require("web3");

if (module !== undefined) {
  module.exports = {
    digest: feedUpdateDigest
  }
}

var topicLength = 32;
var userLength = 20;
var timeLength = 7;
var levelLength = 1;
var headerLength = 8;
var updateMinLength = topicLength + userLength + timeLength + levelLength + headerLength;




function feedUpdateDigest(request /*request*/, data /*UInt8Array*/) {
  var topicBytes = undefined;
    var userBytes = undefined;
    var protocolVersion = 0;

    protocolVersion = request.protocolVersion

  try {
    topicBytes = web3.utils.hexToBytes(request.feed.topic);
  } catch(err) {
    console.error("topicBytes: " + err);
    return undefined;
  }

  try {
    userBytes = web3.utils.hexToBytes(request.feed.user);
  } catch(err) {
    console.error("topicBytes: " + err);
    return undefined;
  }

  var buf = new ArrayBuffer(updateMinLength + data.length);
  var view = new DataView(buf);
    var cursor = 0;

    view.setUint8(cursor, protocolVersion) // first byte is protocol version.
    cursor+=headerLength; // leave the next 7 bytes (padding) set to zero

  topicBytes.forEach(function(v) {
    view.setUint8(cursor, v);
    cursor++;
  });

  userBytes.forEach(function(v) {
    view.setUint8(cursor, v);
    cursor++;
  });

  // time is little-endian
  view.setUint32(cursor, request.epoch.time, true);
  cursor += 7;

  view.setUint8(cursor, request.epoch.level);
  cursor++;

  data.forEach(function(v) {
    view.setUint8(cursor, v);
    cursor++;
    });
    console.log(web3.utils.bytesToHex(new Uint8Array(buf)))

  return web3.utils.sha3(web3.utils.bytesToHex(new Uint8Array(buf)));
}

// data payload
data = new Uint8Array([5,154,15,165,62])

// request template, obtained calling http://localhost:8500/bzz-feed:/?user=<0xUSER>&topic=<0xTOPIC>&meta=1
request = {"feed":{"topic":"0x1234123412341234123412341234123412341234123412341234123412341234","user":"0xabcdefabcdefabcdefabcdefabcdefabcdefabcd"},"epoch":{"time":1538650124,"level":25},"protocolVersion":0}

// obtain digest
digest = feedUpdateDigest(request, data)

console.log(digest)

5.3. Manifests

In general manifests declare a list of strings associated with swarm hashes. A manifest matches to exactly one hash, and it consists of a list of entries declaring the content which can be retrieved through that hash. Let us begin with an introductory example.

This is demonstrated by the following example. Let’s create a directory containing the two orange papers and an html index file listing the two pdf documents.

$ ls -1 orange-papers/
index.html
smash.pdf
sw^3.pdf

$ cat orange-papers/index.html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
  </head>
  <body>
    <ul>
      <li>
        <a href="./sw^3.pdf">Viktor Trón, Aron Fischer, Dániel Nagy A and Zsolt Felföldi, Nick Johnson: swap, swear and swindle: incentive system for swarm.</a>  May 2016
      </li>
      <li>
        <a href="./smash.pdf">Viktor Trón, Aron Fischer, Nick Johnson: smash-proof: auditable storage for swarm secured by masked audit secret hash.</a> May 2016
      </li>
    </ul>
  </body>
</html>

We now use the swarm up command to upload the directory to swarm to create a mini virtual site.

Note

In this example we are using the public gateway through the bzz-api option in order to upload. The examples below assume a node running on localhost to access content. Make sure to run a local node to reproduce these examples

swarm --recursive --defaultpath orange-papers/index.html --bzzapi http://swarm-gateways.net/ up orange-papers/ 2> up.log
> 2477cc8584cc61091b5cc084cdcdb45bf3c6210c263b0143f030cf7d750e894d

The returned hash is the hash of the manifest for the uploaded content (the orange-papers directory):

We now can get the manifest itself directly (instead of the files they refer to) by using the bzz-raw protocol bzz-raw:

wget -O- "http://localhost:8500/bzz-raw:/2477cc8584cc61091b5cc084cdcdb45bf3c6210c263b0143f030cf7d750e894d"

> {
  "entries": [
    {
      "hash": "4b3a73e43ae5481960a5296a08aaae9cf466c9d5427e1eaa3b15f600373a048d",
      "contentType": "text/html; charset=utf-8"
    },
    {
      "hash": "4b3a73e43ae5481960a5296a08aaae9cf466c9d5427e1eaa3b15f600373a048d",
      "contentType": "text/html; charset=utf-8",
      "path": "index.html"
    },
    {
      "hash": "69b0a42a93825ac0407a8b0f47ccdd7655c569e80e92f3e9c63c28645df3e039",
      "contentType": "application/pdf",
      "path": "smash.pdf"
    },
    {
      "hash": "6a18222637cafb4ce692fa11df886a03e6d5e63432c53cbf7846970aa3e6fdf5",
      "contentType": "application/pdf",
      "path": "sw^3.pdf"
    }
  ]
}

Manifests contain content_type information for the hashes they reference. In other contexts, where content_type is not supplied or, when you suspect the information is wrong, it is possible to specify the content_type manually in the search query. For example, the manifest itself should be text/plain:

http://localhost:8500/bzz-raw:/2477cc8584cc61091b5cc084cdcdb45bf3c6210c263b0143f030cf7d750e894d?content_type="text/plain"

Now you can also check that the manifest hash matches the content (in fact swarm does it for you):

$ wget -O- http://localhost:8500/bzz-raw:/2477cc8584cc61091b5cc084cdcdb45bf3c6210c263b0143f030cf7d750e894d?content_type="text/plain" > manifest.json

$ swarm hash manifest.json
> 2477cc8584cc61091b5cc084cdcdb45bf3c6210c263b0143f030cf7d750e894d

A useful feature of manifests is that we can match paths with URLs. In some sense this makes the manifest a routing table and so the manifest acts as if it was a host.

More concretely, continuing in our example, when we request:

GET http://localhost:8500/bzz:/2477cc8584cc61091b5cc084cdcdb45bf3c6210c263b0143f030cf7d750e894d/sw^3.pdf

Swarm first retrieves the document matching the manifest above. The url path sw^3 is then matched against the entries. In this case a perfect match is found and the document at 6a182226… is served as a pdf.

As you can see the manifest contains 4 entries, although our directory contained only 3. The extra entry is there because of the --defaultpath orange-papers/index.html option to swarm up, which associates the empty path with the file you give as its argument. This makes it possible to have a default page served when the url path is empty. This feature essentially implements the most common webserver rewrite rules used to set the landing page of a site served when the url only contains the domain. So when you request

GET http://localhost:8500/bzz:/2477cc8584cc61091b5cc084cdcdb45bf3c6210c263b0143f030cf7d750e894d/

you get served the index page (with content type text/html) at 4b3a73e43ae5481960a5296a08aaae9cf466c9d5427e1eaa3b15f600373a048d.

Swarm manifests don’t “break” like a file system. In a file system, the directory matches at the path separator (/ in linux) at the end of a directory name:

-- dirname/
----subdir1/
------subdir1file.ext
------subdir2file.ext
----subdir2/
------subdir2file.ext

In swarm, path matching does not happen on a given path separator, but on common prefixes. Let’s look at an example: The current manifest for the theswarm.eth homepage is as follows:

wget -O- "http://swarm-gateways.net/bzz-raw:/theswarm.eth/ > manifest.json

> {"entries":[{"hash":"ee55bc6844189299a44e4c06a4b7fbb6d66c90004159c67e6c6d010663233e26","path":"LICENSE","mode":420,"size":1211,"mod_time":"2018-06-12T15:36:29Z"},
            {"hash":"57fc80622275037baf4a620548ba82b284845b8862844c3f56825ae160051446","path":"README.md","mode":420,"size":96,"mod_time":"2018-06-12T15:36:29Z"},
            {"hash":"8919df964703ccc81de5aba1b688ff1a8439b4460440a64940a11e1345e453b5","path":"Swarm_files/","contentType":"application/bzz-manifest+json","mod_time":"0001-01-01T00:00:00Z"},
            {"hash":"acce5ad5180764f1fb6ae832b624f1efa6c1de9b4c77b2e6ec39f627eb2fe82c","path":"css/","contentType":"application/bzz-manifest+json","mod_time":"0001-01-01T00:00:00Z"},
            {"hash":"0a000783e31fcf0d1b01ac7d7dae0449cf09ea41731c16dc6cd15d167030a542","path":"ethersphere/orange-papers/","contentType":"application/bzz-manifest+json","mod_time":"0001-01-01T00:00:00Z"},
            {"hash":"b17868f9e5a3bf94f955780e161c07b8cd95cfd0203d2d731146746f56256e56","path":"f","contentType":"application/bzz-manifest+json","mod_time":"0001-01-01T00:00:00Z"},
            {"hash":"977055b5f06a05a8827fb42fe6d8ec97e5d7fc5a86488814a8ce89a6a10994c3","path":"i","contentType":"application/bzz-manifest+json","mod_time":"0001-01-01T00:00:00Z"},
            {"hash":"48d9624942e927d660720109b32a17f8e0400d5096c6d988429b15099e199288","path":"js/","contentType":"application/bzz-manifest+json","mod_time":"0001-01-01T00:00:00Z"},
            {"hash":"294830cee1d3e63341e4b34e5ec00707e891c9e71f619bc60c6a89d1a93a8f81","path":"talks/","contentType":"application/bzz-manifest+json","mod_time":"0001-01-01T00:00:00Z"},
            {"hash":"12e1beb28d86ed828f9c38f064402e4fac9ca7b56dab9cf59103268a62a2b35f","contentType":"text/html; charset=utf-8","mode":420,"size":31371,"mod_time":"2018-06-12T15:36:29Z"}
  ]}

Note the path for entry b17868...: It is f. This means, there are more than one entries for this manifest which start with an f, and all those entries will be retrieved by requesting the hash b17868... and through that arrive at the matching manifest entry:

$ wget -O- http://localhost:8500/bzz-raw:/b17868f9e5a3bf94f955780e161c07b8cd95cfd0203d2d731146746f56256e56/

{"entries":[{"hash":"25e7859eeb7366849f3a57bb100ff9b3582caa2021f0f55fb8fce9533b6aa810","path":"avicon.ico","mode":493,"size":32038,"mod_time":"2018-06-12T15:36:29Z"},
            {"hash":"97cfd23f9e36ca07b02e92dc70de379a49be654c7ed20b3b6b793516c62a1a03","path":"onts/glyphicons-halflings-regular.","contentType":"application/bzz-manifest+json","mod_time":"0001-01-01T00:00:00Z"}
 ]}

So we can see that the f entry in the root hash resolves to a manifest containing avicon.ico and onts/glyphicons-halflings-regular. The latter is interesting in itself: its content_type is application/bzz-manifest+json, so it points to another manifest. Its path also does contain a path separator, but that does not result in a new manifest after the path separator like a directory (e.g. at onts/). The reason is that on the file system on the hard disk, the fonts directory only contains one directory named glyphicons-halflings-regular, thus creating a new manifest for just onts/ would result in an unnecessary lookup. This general approach has been chosen to limit unnecessary lookups that would only slow down retrieval, and manifest “forks” happen in order to have the logarythmic bandwidth needed to retrieve a file in a directory with thousands of files.

When requesting wget -O- "http://swarm-gateways.net/bzz-raw:/theswarm.eth/favicon.ico, swarm will first retrieve the manifest at the root hash, match on the first f in the entry list, resolve the hash for that entry and finally resolve the hash for the favicon.ico file.

For the theswarm.eth page, the same applies to the i entry in the root hash manifest. If we look up that hash, we’ll find entries for mages/ (a further manifest), and ndex.html, whose hash resolves to the main index.html for the web page.

Paths like css/ or js/ get their own manifests, just like common directories, because they contain several files.

Note

If a request is issued which swarm can not resolve unambiguosly, a 300 "Multiplce Choices" HTTP status will be returned. In the example above, this would apply for a request for http://swarm-gateways.net/bzz:/theswarm.eth/i, as it could match both images/ as well as index.html

5.4. Encryption

Introduced in POC 0.3, symmetric encryption is now readily available to be used with the swarm up upload command. The encryption mechanism is meant to protect your information and make the chunked data unreadable to any handling Swarm node.

Swarm uses Counter mode encryption to encrypt and decrypt content. When you upload content to Swarm, the uploaded data is split into 4 KB chunks. These chunks will all be encoded with a separate randomly generated encryption key. The encryption happens on your local Swarm node, unencrypted data is not shared with other nodes. The reference of a single chunk (and the whole content) will be the concatenation of the hash of encoded data and the decryption key. This means the reference will be longer than the standard unencrypted Swarm reference (64 bytes instead of 32 bytes).

When your node syncs the encrypted chunks of your content with other nodes, it does not share the the full references (or the decryption keys in any way) with the other nodes. This means that other nodes will not be able to access your original data, moreover they will not be able to detect whether the synchronized chunks are encrypted or not.

When your data is retrieved it will only get decrypted on your local Swarm node. During the whole retrieval process the chunks traverse the network in their encrypted form, and none of the participating peers are able to decrypt them. They are only decrypted and assembled on the Swarm node you use for the download.

More info about how we handle encryption at Swarm can be found here.

Note

Swarm currently supports both encrypted and unencrypted swarm up commands through usage of the --encrypt flag. This might change in the future as we will refine and make Swarm a safer network.

Important

The encryption feature is non-deterministic (due to a random key generated on every upload request) and users of the API should not rely on the result being idempotent; thus uploading the same content twice to Swarm with encryption enabled will not result in the same reference.

Example usage:

swarm up foo.txt
> 4b964a75ab19db960c274058695ca4ae21b8e19f03ddf1be482ba3ad3c5b9f9b
# note the short reference of the unencrypted upload
swarm up --encrypt foo.txt
> c2ebba57da7d97bc4725a542ff3f0bd37163fd564e0298dd87f320368ae4faddd1f25a870a7bb7e5d526a7623338e4e9b8399e76df8b634020d11d969594f24a
# note the longer reference of the encrypted upload
swarm up --encrypt foo.txt
>
e76efd76ef1161e4903acc43b5dc634c02fbba7e5f242c32726e78d4e71ffa9cf5a6ca8a19cbada15f38cac79557a930055d5a465a9f868d07122428267045ba
# note the different reference on the second upload (because of the random encryption key)

5.5. Access Control

Swarm supports restricting access to content through several access control strategies:

  • Password protection - where a number of undisclosed parties can access content using a shared secret (pass, act)

  • Selective access using Elliptic Curve key-pairs:

    • For an undisclosed party - where only one grantee can access the content (pk)
    • For a number of undisclosed parties - where every grantee can access the content (act)

5.5.1. Password protection

The simplest type of credential is a passphrase. In typical use cases, the passphrase is distributed by off-band means, with adequate security measures. Any user that knows the passphrase can access the content.

When using password protection, a given content reference (e.g.: a given Swarm manifest address or, alternatively, a Mutable Resource address) is encrypted using scrypt with a given passphrase and a random salt. The encrypted reference and the salt are then embedded into an unencrypted manifest which can be freely distributed but only accessed by undisclosed parties that posses knowledge of the passphrase.

Password protection can also be used for selective access when using the act strategy - similarly to granting access to a certain EC key access can be also given to a party identified by a password. In fact, one could also create an act manifest that solely grants access to grantees through passwords, without the need to know their public keys.

5.5.2. Selective access using EC keys

A more sophisticated type of credential is an Elliptic Curve private key, identical to those used throughout Ethereum for accessing accounts.

In order to obtain the content reference, an Elliptic-curve Diffie–Hellman (ECDH) key agreement needs to be performed between a provided EC public key (that of the content publisher) and the authorized key, after which the undisclosed authorized party can decrypt the reference to the access controlled content.

Whether using access control to disclose content to a single party (by using the pk strategy) or to multiple parties (using the act strategy), a third unauthorized party cannot find out the identity of the authorized parties. The third party can, however, know the number of undisclosed grantees to the content. This, however, can be mitigated by adding bogus grantee keys while using the act strategy in cases where masking the number of grantees is necessary. This is not the case when using the pk strategy, as it as by definition an agreement between two parties and only two parties (the publisher and the grantee).

Important

Accessing content which is access controlled is enabled only when using a local Swarm node (e.g. running on localhost) in order to keep your data, passwords and encryption keys safe. This is enforced through an in-code guard.

Danger

NEVER (EVER!) use an external gateway to upload or download access controlled content as you will be putting your privacy at risk! You have been fairly warned!

5.5.3. Usage

Creating access control for content is currently supported only through CLI usage.

Accessing restricted content is available through CLI and HTTP. When accessing content which is restricted by a password HTTP Basic access authentication can be used out-of-the-box.

Important

When accessing content which is restricted to certain EC keys - the node which exposes the HTTP proxy that is queried must be started with the granted private key as its bzzaccount CLI parameter.

5.5.4. CLI usage

Important

Restricting access to content on Swarm is a 2-step process - you first upload your content, then wrap the reference with an access control manifest. We recommend that you always upload your content with encryption enabled. In the following examples we will refer the uploaded content hash as REF

Protecting content with a password:

Note

The --password flag when using the pass strategy refers to the password that protects the access-controlled content. This file should contain the password in plaintext. The command expects you to input the uploaded swarm content hash you’d like to limit access to (REF)

$ echo 'mysupersecretpassword' > /path/to/password/file
$ swarm access new pass --password /path/to/password/file <REF>
4b964a75ab19db960c274058695ca4ae21b8e19f03ddf1be482ba3ad3c5b9f9b

The returned hash 4b964a75ab19db960c274058695ca4ae21b8e19f03ddf1be482ba3ad3c5b9f9b is the hash of the access controlled manifest. When requesting this hash through the HTTP gateway you should receive an HTTP Unauthorized 401 error:

$ curl http://localhost:8500/bzz:/4b964a75ab19db960c274058695ca4ae21b8e19f03ddf1be482ba3ad3c5b9f9b

The same request should make an authentication dialog pop-up in the browser. You could then input the password needed and the content should correctly appear.

Requesting the same hash with HTTP basic authentication (password only) would return the content too:

$ curl http://:mysupersecretpassword@localhost:8500/bzz:/4b964a75ab19db960c274058695ca4ae21b8e19f03ddf1be482ba3ad3c5b9f9b

Protecting content with Elliptic curve keys (single grantee):

Note

The pk strategy requires a bzzaccount to encrypt with. The most comfortable option in this case would be the same bzzaccount you normally start your Swarm node with - this will allow you to access your content seamlessly through that node at any given point in time.

Note

Grantee public keys are expected to be in an secp256 compressed form - 66 characters long string (e.g. 02e6f8d5e28faaa899744972bb847b6eb805a160494690c9ee7197ae9f619181db). Comments and other characters are not allowed.

$ swarm --bzzaccount 2f1cd699b0bf461dcfbf0098ad8f5587b038f0f1 access new pk --grant-key 02e6f8d5e28faaa899744972bb847b6eb805a160494690c9ee7197ae9f619181db <REF>
4b964a75ab19db960c274058695ca4ae21b8e19f03ddf1be482ba3ad3c5b9f9b

The returned hash 4b964a75ab19db960c274058695ca4ae21b8e19f03ddf1be482ba3ad3c5b9f9b is the hash of the access controlled manifest.

The only way to fetch the access controlled content in this case would be to request the hash through one of the nodes that were granted access and/or posses the granted private key (and that the requesting node has been started with the appropriate bzzaccount that is associated with the relevant key) - either the local node that was used to upload the content or the node which was granted access through its public key.

Protecting content with Elliptic curve keys and passwords (multiple grantees):

Note

The act strategy requires a bzzaccount to encrypt with. The most comfortable option in this case would be the same bzzaccount you normally start your Swarm node with - this will allow you to access your content seamlessly through that node at any given point in time

Note

the act strategy expects a grantee public-key list and/or a list of permitted passwords to be communicated to the CLI. This is done using the --grant-keys flag and/or the --password flag. Grantee public keys are expected to be in an secp256 compressed form - 66 characters long string (e.g. 02e6f8d5e28faaa899744972bb847b6eb805a160494690c9ee7197ae9f619181db). Each grantee should appear in a separate line. Passwords are also expected to be line-separated. Comments and other characters are not allowed.

$ swarm --bzzaccount 2f1cd699b0bf461dcfbf0098ad8f5587b038f0f1 access new act --grant-keys /path/to/public-keys/file --password /path/to/passwords/file <REF>
4b964a75ab19db960c274058695ca4ae21b8e19f03ddf1be482ba3ad3c5b9f9b

The returned hash 4b964a75ab19db960c274058695ca4ae21b8e19f03ddf1be482ba3ad3c5b9f9b is the hash of the access controlled manifest.

The access controlled content could be accessed in one of the following ways:

  1. Request the hash through one of the nodes that were granted access and/or posses the granted private key (and that the requesting node has been started with the appropriate bzzaccount that is associated with the relevant key) - either the local node that was used to upload the content or one of the nodes which were granted access through their public keys
  2. Request the hash with HTTP authentication using one of the granted passwords

5.5.5. HTTP usage

Accessing restricted content on Swarm through the HTTP API is, as mentioned, limited to your local node due to security considerations. Whenever requesting a restricted resource without the proper credentials via the HTTP proxy, the Swarm node will respond with an HTTP 401 Unauthorized response code.

When accessing password protected content:

When accessing a resource protected by a passphrase without the appropriate credentials the browser will receive an HTTP 401 Unauthorized response and will show a pop-up dialog asking for a username and password. For the sake of decrypting the content - only the password input in the dialog matters and the username field can be left blank.

The credentials for accessing content protected by a password can be provided in the initial request in the form of: http://:<password>@localhost:8500/bzz:/<hash or ens name>

Important

Access controlled content should be accessed through the bzz:// protocol

When accessing EC key protected content:

When accessing a resource protected by EC keys, the node that requests the content will try to decrypt the restricted content reference using its own EC key which is associated with the current bzz account that the node was started with (see the --bzzaccount flag). If the node’s key is granted access - the content will be decrypted and displayed, otherwise - an HTTP 401 Unauthorized error will be returned by the node.

5.6. FUSE

Another way of interacting with Swarm is by mounting it as a local filesystem using FUSE (Filesystem in Userspace). There are three IPC API’s which help in doing this.

Note

FUSE needs to be installed on your Operating System for these commands to work. Windows is not supported by FUSE, so these command will work only in Linux, Mac OS and FreeBSD. For installation instruction for your OS, see “Installing FUSE” section below.

5.6.1. Installing FUSE

  1. Linux (Ubuntu)
sudo apt-get install fuse
sudo modprobe fuse
sudo chown <username>:<groupname> /etc/fuse.conf
sudo chown <username>:<groupname> /dev/fuse
  1. Mac OS

    Either install the latest package from https://osxfuse.github.io/ or use brew as below

brew update
brew install caskroom/cask/brew-cask
brew cask install osxfuse

5.6.2. CLI Usage

The Swarm CLI now integrates commands to make FUSE usage easier and streamlined.

Note

When using FUSE from the CLI, we assume you are running a local Swarm node on your machine. The FUSE commands attach to the running node through bzzd.ipc

One use case to mount a Swarm hash via FUSE is a file sharing feature accessible via your local file system. Files uploaded to Swarm are then transparently accessible via your local file system, just as if they were stored locally.

To mount a Swarm resource, first upload some content to Swarm using the swarm up <resource> command. You can also upload a complete folder using swarm –recursive up <directory>. Once you get the returned manifest hash, use it to mount the manifest to a mount point (the mount point should exist on your hard drive):

swarm fs mount --ipcpath <path-to-bzzd.ipc> <manifest-hash> <mount-point>

For example:

swarm fs mount --ipcpath /home/user/ethereum/bzzd.ipc <manifest-hash> /home/user/swarmmount

Your running Swarm node terminal output should show something similar to the following in case the command returned successfuly:

Attempting to mount /path/to/mount/point
Serving 6e4642148d0a1ea60e36931513f3ed6daf3deb5e499dcf256fa629fbc22cf247 at /path/to/mount/point
Now serving swarm FUSE FS                manifest=6e4642148d0a1ea60e36931513f3ed6daf3deb5e499dcf256fa629fbc22cf247 mountpoint=/path/to/mount/point

You may get a “Fatal: had an error calling the RPC endpoint while mounting: context deadline exceeded” error if it takes too long to retrieve the content.

In your OS, via terminal or file browser, you now should be able to access the contents of the Swarm hash at /path/to/mount/point, i.e. ls /home/user/swarmmount

Through your terminal or file browser, you can interact with your new mount as if it was a local directory. Thus you can add, remove, edit, create files and directories just as on a local directory. Every such action will interact with Swarm, taking effect on the Swarm distributed storage. Every such action also will result in a new hash for your mounted directory. If you would unmount and remount the same directory with the previous hash, your changes would seem to have been lost (effectively you are just mounting the previous version). While you change the current mount, this happens under the hood and your mount remains up-to-date.

To unmount a swarmfs mount, either use the List Mounts command below, or use a known mount point:

swarm fs unmount --ipcpath <path-to-bzzd.ipc> <mount-point>
> 41e422e6daf2f4b32cd59dc6a296cce2f8cce1de9f7c7172e9d0fc4c68a3987a

The returned hash is the latest manifest version that was mounted. You can use this hash to remount the latest version with the most recent changes.

To see all existing swarmfs mount points, use the List Mounts command:

swarm fs list --ipcpath <path-to-bzzd.ipc>

Example Output:

Found 1 swarmfs mount(s):
0:
        Mount point: /path/to/mount/point
        Latest Manifest: 6e4642148d0a1ea60e36931513f3ed6daf3deb5e499dcf256fa629fbc22cf247
        Start Manifest: 6e4642148d0a1ea60e36931513f3ed6daf3deb5e499dcf256fa629fbc22cf247

5.7. BZZ URL schemes

Swarm offers 6 distinct URL schemes:

5.7.1. bzz

The bzz scheme assumes that the domain part of the url points to a manifest. When retrieving the asset addressed by the URL, the manifest entries are matched against the URL path. The entry with the longest matching path is retrieved and served with the content type specified in the corresponding manifest entry.

Example:

GET http://localhost:8500/bzz:/2477cc8584cc61091b5cc084cdcdb45bf3c6210c263b0143f030cf7d750e894d/readme.md

returns a readme.md file if the manifest at the given hash address contains such an entry.

$ ls
readme.md
$ swarm --recursive up .
c4c81dbce3835846e47a83df549e4cad399c6a81cbf83234274b87d49f5f9020
$ curl http://localhost:8500/bzz-raw:/c4c81dbce3835846e47a83df549e4cad399c6a81cbf83234274b87d49f5f9020/readme.md
## Hello Swarm!

Swarm is awesome%

If the manifest does not contain an file at readme.md itself, but it does contain multiple entries to which the URL could be resolved, e.g. in the example above, the manifest has entries for readme.md.1 and readme.md.2, the API returns an HTTP response “300 Multiple Choices”, indicating that the request could not be unambiguously resolved. A list of available entries is returned via HTTP or JSON.

$ ls
readme.md.1 readme.md.2
$ swarm --recursive up .
679bde3ccb6fb911db96a0ea1586c04899c6c0cc6d3426e9ee361137b270a463
$ curl -H "Accept:application/json" http://localhost:8500/bzz:/679bde3ccb6fb911db96a0ea1586c04899c6c0cc6d3426e9ee361137b270a463/readme.md
{"Msg":"\u003ca href='/bzz:/679bde3ccb6fb911db96a0ea1586c04899c6c0cc6d3426e9ee361137b270a463/readme.md.1'\u003ereadme.md.1\u003c/a\u003e\u003cbr/\u003e\u003ca href='/bzz:/679bde3ccb6fb911db96a0ea1586c04899c6c0cc6d3426e9ee361137b270a463/readme.md.2'\u003ereadme.md.2\u003c/a\u003e\u003cbr/\u003e","Code":300,"Timestamp":"Fri, 15 Jun 2018 14:48:42 CEST","Details":""}
$ curl -H "Accept:application/json" http://localhost:8500/bzz:/679bde3ccb6fb911db96a0ea1586c04899c6c0cc6d3426e9ee361137b270a463/readme.md | jq
{
    "Msg": "<a href='/bzz:/679bde3ccb6fb911db96a0ea1586c04899c6c0cc6d3426e9ee361137b270a463/readme.md.1'>readme.md.1</a><br/><a href='/bzz:/679bde3ccb6fb911db96a0ea1586c04899c6c0cc6d3426e9ee361137b270a463/readme.md.2'>readme.md.2</a><br/>",
    "Code": 300,
    "Timestamp": "Fri, 15 Jun 2018 14:49:02 CEST",
    "Details": ""
}

bzz scheme also accepts POST requests to upload content and create manifest for them in one go:

$ curl -H "Content-Type: text/plain" --data-binary "some-data" http://localhost:8500/bzz:/
635d13a547d3252839e9e68ac6446b58ae974f4f59648fe063b07c248494c7b2%
$ curl http://localhost:8500/bzz:/635d13a547d3252839e9e68ac6446b58ae974f4f59648fe063b07c248494c7b2/
some-data%
$ curl -H "Accept:application/json" http://localhost:8500/bzz-raw:/635d13a547d3252839e9e68ac6446b58ae974f4f59648fe063b07c248494c7b2/ | jq .
{
    "entries": [
        {
            "hash": "379f234c04ed1a18722e4c76b5029ff6e21867186c4dfc101be4f1dd9a879d98",
            "contentType": "text/plain",
            "mode": 420,
            "size": 9,
            "mod_time": "2018-06-15T15:46:28.835066044+02:00"
        }
    ]
}

5.7.2. bzz-raw

GET http://localhost:8500/bzz-raw:/2477cc8584cc61091b5cc084cdcdb45bf3c6210c263b0143f030cf7d750e894d

When responding to GET requests with the bzz-raw scheme, Swarm does not assume that the hash resolves to a manifest. Instead it just serves the asset referenced by the hash directly. So if the hash actually resolves to a manifest, it returns the raw manifest content itself.

E.g. continuing the example in the bzz section above with readme.md.1 and readme.md.2 in the manifest:

$ curl http://localhost:8500/bzz-raw:/679bde3ccb6fb911db96a0ea1586c04899c6c0cc6d3426e9ee361137b270a463/ | jq
{
    "entries": [
        {
        "hash": "efc6d4a7d7f0846973a321d1702c0c478a20f72519516ef230b63baa3da18c22",
        "path": "readme.md.",
        "contentType": "application/bzz-manifest+json",
        "mod_time": "0001-01-01T00:00:00Z"
        }
    ]
    }
$ curl http://localhost:8500/bzz-raw:/efc6d4a7d7f0846973a321d1702c0c478a20f72519516ef230b63baa3da18c22/ | jq
{
    "entries": [
        {
            "hash": "d0675100bc4580a0ad890b5d6f06310c0705d4ab1e796cfa1a8c597840f9793f",
            "path": "1",
            "mode": 420,
            "size": 33,
            "mod_time": "2018-06-15T14:21:32+02:00"
        },
        {
            "hash": "f97cf36ac0dd7178c098f3661cd0402fcc711ff62b67df9893d29f1db35adac6",
            "path": "2",
            "mode": 420,
            "size": 35,
            "mod_time": "2018-06-15T14:42:06+02:00"
        }
    ]
    }

The content_type query parameter can be supplied to specify the MIME type you are requesting, otherwise content is served as an octet-stream per default. For instance if you have a pdf document (not the manifest wrapping it) at hash 6a182226... then the following url will properly serve it.

GET http://localhost:8500/bzz-raw:/6a18222637cafb4ce692fa11df886a03e6d5e63432c53cbf7846970aa3e6fdf5?content_type=application/pdf

bzz-raw also supports POST requests to upload content to Swarm, the response is the hash of the uploaded content:

$ curl --data-binary "some-data" http://localhost:8500/bzz-raw:/
379f234c04ed1a18722e4c76b5029ff6e21867186c4dfc101be4f1dd9a879d98%
$ curl http://localhost:8500/bzz-raw:/379f234c04ed1a18722e4c76b5029ff6e21867186c4dfc101be4f1dd9a879d98/
some-data%

5.7.3. bzz-list

GET http://localhost:8500/bzz-list:/2477cc8584cc61091b5cc084cdcdb45bf3c6210c263b0143f030cf7d750e894d/path

Returns a list of all files contained in <manifest> under <path> grouped into common prefixes using / as a delimiter. If no path is supplied, all files in manifest are returned. The response is a JSON-encoded object with common_prefixes string field and entries list field.

$ curl http://localhost:8500/bzz-list:/679bde3ccb6fb911db96a0ea1586c04899c6c0cc6d3426e9ee361137b270a463/ | jq
{
    "entries": [
        {
            "hash": "d0675100bc4580a0ad890b5d6f06310c0705d4ab1e796cfa1a8c597840f9793f",
            "path": "readme.md.1",
            "mode": 420,
            "size": 33,
            "mod_time": "2018-06-15T14:21:32+02:00"
        },
        {
            "hash": "f97cf36ac0dd7178c098f3661cd0402fcc711ff62b67df9893d29f1db35adac6",
            "path": "readme.md.2",
            "mode": 420,
            "size": 35,
            "mod_time": "2018-06-15T14:42:06+02:00"
        }
    ]
    }

5.7.4. bzz-hash

GET http://localhost:8500/bzz-hash:/theswarm.eth/

Swarm accepts GET requests for bzz-hash url scheme and responds with the hash value of the raw content, the same content returned by requests with bzz-raw scheme. Hash of the manifest is also the hash stored in ENS so bzz-hash can be used for ENS domain resolution.

Response content type is text/plain.

$ curl http://localhost:8500/bzz-hash:/theswarm.eth/
7a90587bfc04ac4c64aeb1a96bc84f053d3d84cefc79012c9a07dd5230dc1fa4%

5.7.5. bzz-immutable

GET http://localhost:8500/bzz-immutable:/2477cc8584cc61091b5cc084cdcdb45bf3c6210c263b0143f030cf7d750e894d

The same as the generic scheme but there is no ENS domain resolution, the domain part of the path needs to be a valid hash. This is also a read-only scheme but explicit in its integrity protection. A particular bzz-immutable url will always necessarily address the exact same fixed immutable content.

$ curl http://localhost:8500/bzz-immutable:/679bde3ccb6fb911db96a0ea1586c04899c6c0cc6d3426e9ee361137b270a463/readme.md.1
## Hello Swarm!

Swarm is awesome%
$ curl -H "Accept:application/json" http://localhost:8500/bzz-immutable:/theswarm.eth/ | jq .
{
    "Msg": "cannot resolve theswarm.eth: immutable address not a content hash: \"theswarm.eth\"",
    "Code": 404,
    "Timestamp": "Fri, 15 Jun 2018 13:22:27 UTC",
    "Details": ""
}

5.7.6. bzz-resource

bzz-resource allows you to receive hash pointers to content that the ENS entry resolved to at different versions

bzz-resource://<id> - get latest update bzz-resource://<id>/<n> - get latest update on period n bzz-resource://<id>/<n>/<m> - get update version m of period n <id> = ens name