I’m fond of data URIs (MDN link). Twelve years ago I repurposed a tool that stored a webpage and its related resources in a Microsoft-specific format, and rewrote it into something that stores the page as normal HTML with the related resources encoded as data URIs. Recently the topic came up again on a project I was working on, where microservices are still a thing. While discussing it with colleagues, it seemed that this quite useful URI scheme wasn’t top of mind for everyone. Instead, the original idea was: we could upload the resource to S3, pass the link, download the resource from S3 at the receiving end, and then have some policy that takes care of deleting it… nah…
This is the simplest data URI:
data:,Hello%2C%20World%21
You can open it in your browser.
The example above is percent-encoded text; with no media type given, it is treated as text/plain. For binary data such as images, or for more complex documents, base64 encoding is used.
data:text/plain;base64,SGVsbG8sIFdvcmxkIQ==
The above results in the exact same output; open it in your browser.
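To check that both forms really carry the same payload, here is a small Ruby sketch that decodes each one by hand. It only uses the standard library, and the manual percent-decoding is an illustration rather than a complete parser.

```ruby
require "base64"

plain   = "data:,Hello%2C%20World%21"
encoded = "data:text/plain;base64,SGVsbG8sIFdvcmxkIQ=="

# Everything after the first comma is the payload.
plain_payload   = plain.split(",", 2).last
encoded_payload = encoded.split(",", 2).last

# Percent-decode the first form, Base64-decode the second.
from_plain   = plain_payload.gsub(/%\h\h/) { |m| m[1, 2].hex.chr }
from_encoded = Base64.decode64(encoded_payload)

puts from_plain   # Hello, World!
puts from_encoded # Hello, World!
```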
Next, how to do this with Ruby.
Encoding is pretty simple. We take the content type (the MIME type), the data (binary or string), and an optional encoding argument that selects base64; without it, the data is percent-encoded.
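require "base64"
require "erb"

def encode_data_uri(content_type, data, encoding = nil)
  metadata = content_type
  metadata += ";#{encoding}" if encoding
  data = if encoding == 'base64'
    # strict_encode64 emits no line breaks, unlike encode64
    Base64.strict_encode64(data)
  else
    # URI.encode was removed in Ruby 3.0; url_encode writes spaces as %20
    ERB::Util.url_encode(data)
  end
  "data:#{metadata},#{data}"
end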
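One caveat when picking a Base64 method in Ruby: Base64.encode64 wraps its output with a newline after every 60 characters, which is not what you want inside a URI, while Base64.strict_encode64 produces a single line. A quick sketch to see the difference (the 100-byte payload is just an arbitrary stand-in):

```ruby
require "base64"

payload = "x" * 100 # long enough to trigger line wrapping

wrapped = Base64.encode64(payload)        # newline after every 60 characters
strict  = Base64.strict_encode64(payload) # single line, no newlines

puts wrapped.include?("\n") # true
puts strict.include?("\n")  # false

# Both still decode to the same bytes.
puts Base64.decode64(wrapped) == Base64.decode64(strict) # true
```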
Decoding needs to be a bit more robust. This script is a minimal implementation, and could be improved with proper handling of text encodings. For now we hard-assume UTF-8 for all text (Ruby itself defaults to UTF-8), but ideally the encoding is retrieved from the charset parameter or from the document data (e.g. meta headers). That’s up for a follow-up exercise.
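require "base64"

def parse_data_uri(uri)
  return false unless uri.start_with?("data:")

  # Remove the "data:" prefix & split the metadata and data parts
  metadata, data = uri.delete_prefix("data:").split(',', 2)
  return false if data.nil? # no comma, so not a valid data URI

  # Extract content type and encoding; ";base64" is always the last
  # parameter, so "text/plain;charset=utf-8;base64" is handled too
  content_type, *params = metadata.split(';')
  encoding = params.last == 'base64' ? 'base64' : nil

  # Percent-decode by hand: URI.decode was removed in Ruby 3.0, and the
  # form decoders would turn a literal "+" in base64 data into a space
  data = data.b.gsub(/%\h\h/) { |m| m[1, 2].hex.chr }

  # Decode the data part based on the encoding
  data = Base64.decode64(data) if encoding == 'base64'

  if content_type&.start_with?('text/') || content_type&.end_with?('xml')
    data = data.force_encoding('UTF-8')
  end

  { src: uri, content_type: content_type, encoding: encoding, data: data }
end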
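The follow-up exercise mentioned earlier could start from the charset parameter that a data URI may carry in its metadata. Below is a rough sketch of that idea; charset_from_metadata is a hypothetical helper name of mine, not part of any library, and falling back to UTF-8 is an assumption that matches the hard-coded default above.

```ruby
# Hypothetical helper: pick the text encoding from data URI metadata,
# e.g. "text/plain;charset=iso-8859-1;base64".
def charset_from_metadata(metadata, default = "UTF-8")
  params = metadata.to_s.split(";").drop(1) # skip the content type itself
  pair = params.map { |param| param.split("=", 2) }
               .find { |key, _value| key.to_s.strip.casecmp?("charset") }
  Encoding.find(pair&.last || default)
rescue ArgumentError
  # Unknown charset label: fall back to the default.
  Encoding.find(default)
end

metadata = "text/plain;charset=iso-8859-1;base64"
bytes = "caf\xE9".b # "café" as ISO-8859-1 bytes

# Tag the bytes with the declared charset, then convert to UTF-8.
text = bytes.force_encoding(charset_from_metadata(metadata)).encode("UTF-8")
puts text # café
```

In parse_data_uri this would replace the fixed force_encoding('UTF-8') call for text content types.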
As shown above, encoding and decoding data URIs takes very little code. Data URIs are plain text, so they can even be used as data carriers in JSON (you could of course agree on base64-encoding the contents directly, but a data URI is a nice, agreed-upon scheme that also carries the MIME type), or, as I did before, to store an HTML file with all its JS, images and other resources embedded.
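To make the JSON use case concrete, here is a minimal sketch of passing a document between two services without S3 in between. The "filename" and "document" keys and the stand-in bytes are made up for illustration; any message shape would do.

```ruby
require "base64"
require "json"

# Sender: wrap some bytes (a stand-in for a real PDF) in a data URI.
bytes = "%PDF-1.4 example bytes".b
message = JSON.generate({
  "filename" => "report.pdf",
  "document" => "data:application/pdf;base64,#{Base64.strict_encode64(bytes)}"
})

# Receiver: parse the JSON, split the data URI, and decode the payload.
received = JSON.parse(message)
metadata, payload = received["document"].delete_prefix("data:").split(",", 2)
content_type = metadata.split(";").first
decoded = Base64.decode64(payload)

puts content_type     # application/pdf
puts decoded == bytes # true
```

The receiving side learns the MIME type from the URI itself, so no extra field for it has to be agreed upon.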
This article from murblog by Maarten Brouwers (murb) is licensed under a Creative Commons Attribution 3.0 Netherlands license.