Run these 4 minimized cells to load the styles & scripts to pimp the notebook and execute the full demo.

import qualified IHaskell.Display as ID
-- Basic CORS parameter definitions
-- Wrap Json Web Keys for JWT in easy-to-use functions for main logic.
-- Create some Runtime parameters for neater main logic.

Full Stack Deep Dive: What's the deal with Haskell?

No description has been provided for this image is one of those languages that senior developers will have heard mysterious tales of. But as most stuff of legends, there are few people that can confirm or contradict the tales based on practical experience with Haskell. That is unfortunate: it makes a lot of sense for full-stack developers and CTO to count Haskell familiarity as part of their standard skillset.

Why Haskell now? Simpler parts of software development projects are today primarely implemented by hoards of junior and mid-range coders that use Javascript, Python, Java and C#, together with their respective frameworks. Overall scarcity of programmers influence the rest of the application architecture, forcing it to match closely the same ecosystems. The significant change coming is statistical programming by large language models (LLMs), that will soon generate automatically most of the simpler code. Thus simple code is about to disappear from human developers sight, and at the same time a significant part of the coding community is about to disappear! Very likely this will remove a lot of the constraints on the ecosystems to develop in. Yet LLMs are not likely to solve the design of complex applications and the code evolution over projects lifespan for quite some time. Typical key issues software developers will still need to deal with in the foreseable future are:

  • bug-free logic,
  • assessable information security,
  • distributed execution
  • efficient evolution of complex data structures,
  • continuous integration of evolving code
These complex problems call for tools that aren't popular because they are easily accessed but rather because they are absolutely optimal no matter their learning curve. Senior developers and CTOs will be free from having to work with and support current simplistic ecosystems and instead apply their efforts and skills at mastering the powerful but currently exoteric solutions such as Elixir, Haskell, OCaml, Rust, etc. I expect Javascript, Python and the other beginner friendly ecosystems to transition to runtime environment for AI-generated code by tools like Vercel's V0, until AI-optimized runtime environments come along.

That's why I believe Haskell's moment to shine is coming up. In my experience, it's THE tool to reach the Zen of programming, to get a balance between theory and real-life implementations, to achieve quick prototyping and long-term software designs, to use time-tested techniques and innovative ideas. I've not come to that reflection overnight... I have over 40 years of software development experience under my belt, during which I've worked with many many programming ecosystems in many many kind of software development projects with many many types of people. I got curious about Haskell over 10 years ago, slowly integrated its ecosystem into my daily workflow and especially over the last 5 years I've been able to use it efficiently for modern web application development.

No description has been provided for this image

Hopefully you are now curious about Haskell... I wrote this Jupyter notebook to help you bootstrap, to get a feel for the applicability of the Haskell ecosystem in modern software development as quickly as possible. My goal is to show you that this is not wishful thinking, but that if you do spend time learning Haskell you'll be able to apply this knowledge in daily situations for getting the job done efficiently.

This introduction to Haskell in a Jupyter notebook does away with the need to set up a development environment, to compile and run a first simple piece of code. Instead the notebook is loaded with code that works using a Haskell interpreter kernel for Jupyter. That gives you the quickest path to experiencing interactively the Haskell ecosystem to implement a normal, modern, efficient web app based, using HTMX and Tailwind on the frontend, server-side rendering of templates filled with data provided by SQL queries, and low-latency websocket communications on the backend. Plus security, distributed computing, 3rd party APIs, Javascript/C++/Python integration, etc.

Using a Jupyter notebook also means that you need to run all cells in order to get code to work out, and that some issues may happen with CSS styling, HTML rendering and sub-process port management. You will get a better experience by using a Docker image containing all notebook resources on your own environment.

If I manage to get you interested in Haskell with this notebook, your next step will be to develop in your own environment. It is actually easy to get the Haskell ecosystem on your local machine, read some of the many resources to learn how to write Haskell code with language server and Copilot support. You'll quickly be creating software through the edit -- compile -- run -- observe loop and reap the benefits of the ecosystem. Further more I hope to grow the What's up, FUDD? community as a group focused on improving the Haskell ecosystem for web applications.

No description has been provided for this image

Basic ideas about Haskell

The following elements of the Haskell programming language influence the design of libraries and how one codes in its ecosystem:
  • it has a simple and clean syntax: quick to learn with little decorations and high readability.
  • it is properly functional: Haskell guts are based on System F, the foundation of high-order functional logic,
  • it enforces immutability: a variable is assigned only once (or as the functional people would say: a term is bound to a name once only),
  • it has a strong support for pure logic without side effects: logic provided with the same inputs will always produce the same results, as theoritical computing would have it,
  • it has a consistent support for practical computing: breaking the principles just mentioned for real-life software is cleverly integrated in the framework,
  • it has a very powerful typing system that is also strongly enforced: anything is typed (even types themselves!), type matching must be exact for code to be valid, there are types of types, type manipulations, etc,
  • the typing system is powerful enough that it disappears from normal use: you write your code without types (a la JS) and still everything turns out to be consistant and mostly bug-free (unlike JS!),
  • meta-programming is easily accessed: the GHC compiler contains a VM and can execute some code while it compiles some other code, leading to on-the-fly code generation, value derivation, interaction with the outside world, etc,
  • it is extendable: the compiler is introspective and open to express domain-specific ideas, eg embedded SQL,
  • it is opened to other languages: a powerful FFI provides an easy integration of existing C/C++ code,
  • it is optimizable: all kind of directives are available to tune the compiler to produce efficient machine code, and Linear Types can define very performant memory management like in Rust.

Additionally the GHC runtime system brings the following to the table:

  • it puts an emphasis on compiled binaries, but you'll see with this notebook that it works just fine in interpreted mode,
  • it provides a memory-management support (a garbage collector) that is very configurable,
  • it is lazy: by default functions are not executed if they don't need to; on the upside this can provide significant optimizations, and on the downside it can create significant memory requirements,
  • it is parallel: the concept of multi-threading is built-in, and thread-safe code execution can happen in all kind of levels of parallelism depending on the underlying hardware,
  • it is debuggable: the interpreter is the main way of debugging Haskell code, but GDB is also able to step through logic execution,
  • it is observable: execution performance is recordable, with statistics providing valuable insights on optimizing execution time and/or resources utilisation, and the heap can be introspected by a socket-connected debugger,
  • it is self-contained: binaries can be statically linked, so a single executable file is all that's needed to run an application.

Reading Haskell code

Ok, enough with the overview, let's get practical. But before writing a minmal web app, you'll need to be able to read some Haskell code:
No description has been provided for this image

Just kidding, it's pretty simple. The main idea with Haskell syntax is that in general you write code in the following forms:

  • function applications: functionX argA argB
  • bindings: left_hand_side = right_hand_side
  • access to existing logic (packages): import AGreatPackage.BestModule (fctA, fctB)
  • name space creation (modules): module TopNamespace.SubNameSpace (exportA, exportB)

Let's start with some baby code. Here is a infix function application:

In [ ]:
1 + 2
3

This is what one expects: the + function is applied to 1 and 2. In other language the + character may be a reserved symbol, but in Haskell it's just another function that has been defined in a core library. You'll see that some developers really enjoy this freedom of expression and come up with very original names, like .:.: or :>>:.

Any function can be called in prefix or infix mode, so the previous line is equivalent to:

In [2]:
(+) 2 3
5

Another typical example of baby code is the function application of the is-equal function:

In [8]:
1 == 2
False

That function returned the Boolean literal False.

Moving on, there's the right side binding to the left side, which in other languages is typically an assignment statmement. The baby code version of that is:

In [9]:
x1 = 2 + 3

That simply says that x1 takes the value of the + function applied to values 2 and 3. It's almost really simple, and in a non-lazy language you'll expect this to be converted to 6. But Haskell is lazy, so deep down in the execution system it means that x1 is going to be stay the unevaluated term 2 + 3 until x1 needs to be used. Based on this example you may think it is completely useless, but if instead the code of x1 takes hour of CPU time to return a value or terabytes of RAM to perform, it's great that none of these resources gets utilized until it's necessary to have a resulting value. And as laziness is not always welcomed, it is possible to get immediate evaluation on an expression basis.

As Haskell is a functional language, remember that x1 = 2 + 3 is actually a function definition where x1 takes no argument and works with constants. In a pure immutable functional language, there are no variables, just function applications and term bindings...

The non-functional programmer will be more comfortable with the following function definition:

In [10]:
incByOne arg_1 = 1 + arg_1

This code defines a function called incByOne which takes one argument. When executed, it will add 1 to the arg_1 argument, no surprise here.

In Haskell the function's arguments are simply listed one after the other. There is no need for other decoration in the syntax. Here's a function that takes 2 arguments, and adds them together:

In [11]:
yetAnotherAdd arg_1 arg_2 = arg_1 + arg_2

Using that function is also baby code:

In [12]:
yetAnotherAdd 1 2
3

Lambda notation is bread and butter in Haskell, so that function can be rewritten with the following syntax:

In [13]:
aLambdaVersion = \arg_1 arg_2 -> arg_1 + arg_2
Redundant lambda
Found:
aLambdaVersion = \ arg_1 arg_2 -> arg_1 + arg_2
Why Not:
aLambdaVersion arg_1 arg_2 = arg_1 + arg_2
Avoid lambda
Found:
\ arg_1 arg_2 -> arg_1 + arg_2
Why Not:
(+)

You should see some message from GHC suggesting you avoid lambda. That's because defining a named function with the lambda notation is not as clear to read, so the wise men managing Haskell elegance will let you know you're better using the first form.

Finally Haskell does all the expected tricks with curryfication:

In [14]:
addSynonym = (+)
addTwo = (+ 2)

addSynonym 2 3
addTwo 3
5
5

Here addSynonym is a function with two implied arguments as it calls the + function (in prefix mode). The addTwo function has one implied argument, since the + function is provided with a value. Advanced Haskell programmers will often use the curryfication as in addTwo to remove redundancy in code, but I often find that it makes it harder to read and I prefer some redundancy.

Note that up to now we haven't seen any typing anywhere, just like in Javascript. Given Haskell is strongly-typed and has a very powerful typing system, where are all the types?

Ok, let's introduce some typing notation. First, a term can be given a specific type. Taking a previous example, we can specify that the 2 + 3 calculation will result in an Int value (the :: mean of type):

In [15]:
x1 = 2 + 3 :: Int

That tells the compiler that the right-hand-side value is of type Int. The compiler will check that the + function and the 2 and 3 values all make sense, and then also give Int type to x1. This is pretty obvious to anyone who has experience with typed languages. But given that Haskell typing system supports polymorphism, it means that + may exist for many different types (and it does!), and 2 and 3 literals may be representing different kind of intergers (16 bits, 32 bits, unbounded, ...). It is possible that in the first definition of x1, the resulting type was a more general kind of integer. Adding :: Int will lock in the meaning of x1 and force its uses to be limited to Int applications.

Important Note Haskell enforces that type names must start with an upper-case letter, while variable and function names must start with a lower-case letter or a non-letter symbol such as +.

Programmers usually prefer to provide the typing information for the right-hand-side, as in:

In [16]:
x1 :: Int
x1 = 2 + 3

That looks more like a good old C or Java variable definition, where the variable is first declared together with a type, and the value that it represents is provided later and must be of the same type.

While we can often disregard typing information, typically Haskell programmers will specify the typing of functions as a way to documentent the intent of the code as well as locking in certain constraints on the arguments.

For example the following piece of code starts by clearly stating the intent of the developer on what the showOneMore function is expecting to work with: an Int argument (we know it's an argument because it's before the ->). The String part specifies that it will return a String value. When reading code, that's often all the information required to move on and the logic of the function can be skipped. Most popular languages these days have typing information as part of its argument specification, but if you think about it, the Haskell approach is simpler and more consistent in a typing-is-optional situation...

In [17]:
showOneMore :: Int -> String
showOneMore arg_1 = show (1 + arg_1)

The typing for a function with two arguments is simply going to list them all, separated by arrows:

In [18]:
yetAnotherAdd :: Int -> Int -> Int
yetAnotherAdd arg_1 arg_2 = arg_1 + arg_2

Why the arrows as separator and the lack of difference between arguments and the result in the typing syntax? The arrows are from the deep theoritical foundations of Haskell: lambda calculus has used that notation for ... a long time. The lack of difference between arguments and result is again consistency and simplicity of syntax: a function that takes no argument is types as :: Result_Type (like the x1 seend before), a function that takes an argument will add a type and arrow before that, which looks like Arg_Type -> Result_Type, and more arguments will add more Arg_Type ->. Note that curryfication goes the other way: for each value passed to a function, there's one less Arg_Type -> to deal with.

The Hello World example.

At this point there's more than enough syntax knownledge to move on to the proverbial 'Hello World' example. First, here's the code:

In [19]:
main =
  putStrLn "Hello World!!"

Haskell runtime follows the long tradition of the main function being the entry point for an application.

The putStrLn function is part of the basic Haskell library; it takes a string (which is provided in literal form by using the double-quote character), and sends it to the stdout of the running program.

In the Jupyter Notebook, we can simply call the function and the Haskell kernel will redirect its stdout to be the output in the result cell:

In [20]:
main
Hello World!!

Accessing existing logic

Moving forward, let's introduce syntax to access existing code units: the import statement. It instructs the compiler to go fetch code from some other place. That was a simple concept in the good old days, but today it means resolving lots of sources and constraints. In Haskell, a unit of code is called a module, and it is a hierchical concept. Modules are stored in packages, which is the equivalent of libraries for C++, crates for Rust, packages (yeah!) for Javascript, and so on. The GHC compiler provided some package management tools early on (ghc-pkg); then more and more features have come up in that area over the years (cabal, stack, ...). Stackage is the main site for curated package distributions and Hackage is the largest site. There are much less efforts in Hackage to curate the quality and stability of its content.

Important Note Haskell enforces that module names start with an upper-case letter, just like types.

Bringing in the entire content of a module in the current code namespace is the simplest form of import:

In [21]:
import Data.Char

That code means that everything defined in the Data.Char package is now part of the current code namespace. Of course that leads to collisions and conflicts, so it is typical to control what is brought in the namespace and also add some prefix to what is imported, as in:

In [22]:
import qualified Text.Blaze.Html5 as H
import qualified Text.Blaze.Html5.Attributes as A

This code adds two modules to the current code namespace but we must use the prefix H and A to access whatever is defined within these two modules.

It turns out that these two modules are part of the Blaze package, which provides all elements and attributes of HTML as functions of the same name. That's useful in web application development, so let's use that package.

Wait, I forgot two other syntax elements that you must know about. First, the oh-so-powerful comment! Same-line comments are written by prefixing them with two dashes, while the curly brackets and dashes provide multi-line comments, as in:

In [23]:
-- Nothing impacts the compiler here...
{- and
  same
  here
-}

The second syntax element to know of is for providing compiler directives (pragmas): the GHC compiler has adopted a special version of curly-bracket comment, and we'll use that right now to turn on a processing mode that unfortunately isn't on by default, the Overloaded String mode:

In [24]:
{-# LANGUAGE OverloadedStrings #-}

Adding processing modes when using the interpreter can also be done with an internal command:

In [25]:
:set -XOverloadedStrings

The OverloadedStrings mode means that the compiler will do polymorphism resolution of string literals. In practical terms it means we can let the compiler figure out the right logic to transform a string literal into a Html value that Blaze wants for most of its DOM building functions.

Ok, now let's use Blaze functions to send some HTML into the notebook cells (remember, the H and A prefix are required by choice):

In [26]:
H.div H.! A.class_ "text-red-400 bg-gray-700" H.! A.style "border-style: groove; border-color: red" $ do
    H.p "This is an example of creating DOM elements similarly to JSX or (better) Elm..."
    H.div H.! A.style "display: block; margin-left: auto; margin-right: auto; width: 30%" $ H.img H.! A.src "https://pbs.twimg.com/media/F6ezdRuWMAAVJfT?format=png&name=900x900"
    H.b "Hello"

This is an example of creating DOM elements similarly to JSX or (better) Elm...

No description has been provided for this image
Hello

Note: having CSS styling to work as expected in the Jupyter notebook is still magic to me. You need to make sure that you executed the first cell at the very beginning of the notebook to import TailwindCSS. But there are some issues in Jupyter and your mileage will vary on all the CSS working out (eg for me dark mode has never properly rendered within a cell).

There are two more syntax elements introduced in this example that need to be covered before looking at the code in details.

The first one is used constantly: the $ symbol. It is a short-code for putting the rest of an expression in parenthesis. It is like writing a ( and not bothering with the matching )...

So f1 $ 1 + 2 is the same as f1 (1 + 2), and f1 $ f2 $ 1 + 2 is the same as f1 (f2 (1 + 2)).

The second one is another classic: the do term. It is a short-hand notation for sequencing of a block of functions. Function sequencing is a very long topic that brings in the famous Monad type, so we'll stick to using it rather than explaining it.

So the code:

do
  f1
  ...
  fN

means that first f1 function will be called, and then f2 will be called, and so on, and finally the result of fN will be passed as the result of the whole do block.

Remember than parallelism of logic is implied in Haskell and without enforcing a given sequencing the DOM elements could show up in any order based on optimization, as they have no relationship between each other. In a do block, the textual order is the execution order.

Note that while monads are better known as a scare tactic associated with Haskell, they are just a computing pattern that is universal to programming. In fact the Promise of Javascript is exactly a monad construct! But unlike Haskell, Javascript has no generalized monad system and there's specialized syntax, like await, async and .then(...) to work with it.

Now let's inspect the code that generated HTML in the cell. First note that all the HTML elements have an equivalent function in Blaze. Elements-replicating functions take a Html value and returns a Html value, while attributes-replicating functions take additional parameters and return a Html value. The IHaskell kernel has a trick to automatically merge the DOM elements from a resulting Html value into the notebook's cell, and that's why we see rendered HTML rather than some other form of representation. Without that trick, we could see:

<div class='text-red-400 bg-gray-700' style="border-style: groove; border-color: red"><p>This ...

as a text literal, or maybe a JSON representation with lots of escaped characters.

Second, note this is a typical Haskell learning curve situation. The Blaze package is very efficient and simple to use, two great qualities that usually don't go together. But to accomplish that it uses advanced concepts (Monad, combinators, ...) and you'll stay a basic user of the package unless you go through the learning curve of monoids, semigroups, functors, and so on to properly understand how to get all the juice of Blaze or to add new feature to it.

Let's now look at the code: a div function from the Text.Blaze.Html5 module will generate the enclosing DOM element. Further more a style function creates the attribute specification (red color), and the ! function links div and style, and then receives the result of the do block. The ! function? Yes, that's a valid function name in Haskell, and the Blaze programmers have used it...

In that block, each line calls a function that respectively generates img, p and b DOM elements, with some attributes (! calls) and content (string literals). Using Blaze functions is very similar from writing template code for backend HTML generation. But with all the advantages of Haskell language.

Note: Rewriting this code without the module prefix gives:

div ! class_ "text-red-400 bg-gray-700" $ do
p "This is an example of creating DOM elements similarly to JSX or (better) Elm..."
div ! class_ "rounded-md pb-3 pl-3" $ img ! src "https://pbs.twimg.com/..." ! width "150"
    b "Hello"

which is much more comfortable to read. In real-life you'll use an IDE with Haskell language server (HLS) and drop all that prefixing... but in this notebook we'll continue to use the prefixes to explicitely show the origin of functions.

Now let's say we want to make some components a la React to build up a HTML page section in a modular fashion. We'd go with something like:

In [27]:
import qualified Data.Text as T

-- define two components that take some parameters:
componentA width postfix = do
  H.p "This is an example of creating DOM elements similarly to JSX or (better) Elm..."
  H.img H.! A.src (H.textValue . T.pack $ "https://pbs.twimg.com/media/" ++ postfix) H.! A.style "display: block; margin-left: auto; margin-right: auto;" H.! A.width (H.textValue . T.pack . show $ width)

componentB someTxt color =
  H.b H.! A.class_ (H.textValue . T.pack $ "text-" ++ color) $ someTxt

-- then apply to get the main div by using the components:
H.div H.! A.class_ "text-red-400 bg-gray-700" $ do
  componentA 150 "F6ezdRuWMAAVJfT?format=png&name=900x900"
  componentB "Hello" "orange-200"

This is an example of creating DOM elements similarly to JSX or (better) Elm...

No description has been provided for this image Hello

Fairly simple, and at this point you should be able to read most of the code.

A few more things are introduced in this example:

  • the ++ function that concatenates two String values (it is an historical synonym for the more general concatenation function <>).
  • the Data.Text module that provides the pack function to convert String to Text values. The Text type is a more C++-like implementation of strings, while the String type is a basic list implementation.
  • the . operator, which concatenates functions, so func1 . func2 $ a means func1 (func2 a).

Previously when using a string literal such as "something", the conversion from a String value to a Text value to a Html value was done automatically by the compiler through the OverloadedStrings mode. But the compiler doesn't do the work automatically when we're using variables. We must then specify the conversion, and H.textValue . t.pack $ <something> is more consistent with functional code esthetics than H.textValue (t.pack <something>).

We'll take a break from cool capabilities of the Blaze package and go back to introducing three more important syntaxic constructs.

The first one is the case construct, which is the workhorse of conditional logic. It looks like:

In [28]:
aValue = 2

case aValue of
  1 -> "one"
  2 -> "two"
  3 -> "three"
  _ -> "something different than 1, 2 or 3"
"two"

This kind of construct exists in most modern language, for example the switch in C and JS or the match in Rust. The selector is on the right side of the arrow and the code to execute is on the left side of the arrow. The first matching selector for a given value in the case part becomes the code that executes. The _ symbol matches anything so it acts as the default selector. The value of the entire construct is the value returned by the code that executed.

By the way we're referring to construct instead of statement because it returns a value (like an expression and unlike a statement).

Secondly, Haskell also provides the if-then-else construct, which is just a short-hand for writing:

case <condition> of
  True -> <then-part>
  False -> <else-part>

So you can write that as if <condition> then <then-part> else <else-part>, and that construct returns either the result of the <then-part> or the <else-part> based on the True/False value of <condition>. Again, the if-then-else construct will return a value, it is not a statement.

The third construct is the let <something> in <something else>. Essentially the let construct split logic in multiple assignments, and then all these values are re-assembled in the in part to give a result. Recovering the previous HTML example, we could write:

In [29]:
-- wrap up all the previous code into a new bigger component (function):
bigComponent halfWidth =
  -- first, compute all params
  let
    width = 2 * halfWidth
    urlPostfix = "F6ezdRuWMAAVJfT?format=png&name=900x900"
    helloString = "Hello"
    helloColor = "green-500"
  in
    -- then make the div element and its children:
    H.div H.! A.class_ "text-red-400 bg-gray-700" $ do
      componentA width urlPostfix
      componentB helloString helloColor

-- Invoke the overall html block:
bigComponent 75

This is an example of creating DOM elements similarly to JSX or (better) Elm...

No description has been provided for this image Hello

The main idea with the let ... in ... construct is that it first provides a way to introduce names for a list of terms in a small namespace. In an imperative language these would be consider variable declaration/assignements. Second it provides a way to break down logic in short sequences that can be assembled at will in the in part to provide a result. The implied esthetic of this construct isn't apparent immediately but after using for a little while it becomes dearly missed when using languages that don't provide it.

No description has been provided for this image

WHAT ABOUT THE TYPES?!?

Let's pause for a moment and note that so far there's been barely any explicit use of typing. A significant advantage of the Haskell language is that the typing system is so powerful that you rarely need to bother with type annotation like one would do in C, Java or Rust. The compiler can figure out the logical typing arrangements and run code (idem for the interpreter); or if something is too vague it can alert the developer that the logic isn't sound and describe what isn't making sense.

Take a moment to reflect on this a bit more...

The popularity of Javascript and Python is very much related to the idea that strongly-typed languages are just a major pain in the lower back to deal with. It might be so in Java/TypeScript/Rust/etc. But in Haskell the argument falls apart: one can write simpler code like with Python and Javascript, yet still benefit from all the advantages of stronly-typed logic.

What happens when you have a stronly-typed language that doesn't bother you with typing syntax as in Javascript? Interestingly in real life you find yourself enjoying to add type information, or to let the editor/LSP do it for you! Types end up being a powerful way to document your logic... having a typing system that is subtle rather than shoved down your throat turns out to be a fantastic way to code.

While you're doing some thinking about all of these type-related ideas, I'll borrow some HTML & CSS from the Tailwind site and make something more visually pleasing that what I've been putting together up to now... Haskell language features used here will be discussed further down.

In [30]:
:set -XOverloadedRecordDot

import Control.Monad (mapM_, forM_)
import qualified Text.Blaze.Svg11 as S
import qualified Text.Blaze.Svg11.Attributes as Sa

data Project = Project {
    url :: T.Text
    , title :: T.Text
    , category :: T.Text
    , users :: [ UserPrj ]
  }

data UserPrj = UserPrj {
    name :: T.Text
    , unsplID :: T.Text
  }

projects = [
  Project {
    url = "proj_1"
    , title ="First Project"
    , category = "Testing"
    , users = [
      UserPrj "Joe A" "photo-1531123897727-8f129e1688ce"
      , UserPrj "Jane B" "photo-1494790108377-be9c29b29330"
    ]
  }
  , Project {
    url = "proj_2"
    , title ="Second Project"
    , category = "More testing"
    , users = [
      UserPrj "Achmed A" "photo-1494790108377-be9c29b29330"
      , UserPrj "Simona B" "photo-1506794778202-cad84cf45f1d"
    ]
  }
 ]

sectionD =
  H.section $ do
    H.header H.! A.class_ "bg-white space-y-4 p-4 sm:px-8 sm:py-6 lg:p-4 xl:px-8 xl:py-6" $ do
      H.div H.! A.class_ "flex items-center justify-between" $ do
        H.h2 H.! A.class_ "font-semibold text-slate-900" $ H.toHtml "Projects"
        H.a H.! A.href "javascript:void()" H.! A.class_ "hover:bg-blue-400 group flex items-center rounded-md bg-blue-500 text-white text-sm font-medium pl-2 pr-3 py-2 shadow-sm" $ do
          S.svg S.! Sa.width "20" H.! Sa.height "20" H.! Sa.fill "currentColor" H.! A.class_ "mr-2" $
            S.path S.! Sa.d "M10 5a1 1 0 0 1 1 1v3h3a1 1 0 1 1 0 2h-3v3a1 1 0 1 1-2 0v-3H6a1 1 0 1 1 0-2h3V6a1 1 0 0 1 1-1Z"
          H.toHtml "New"
      form
    projectsD projects

form =
  H.form H.! A.class_ "group relative" $ do
      S.svg S.! Sa.width "20" S.! Sa.height "20" S.! Sa.fill "currentColor" S.! 
            A.class_ "absolute left-3 top-1/2 -mt-2.5 text-slate-400 pointer-events-none group-focus-within:text-blue-500" $
        S.path S.! Sa.fillRule "evenodd" S.! Sa.clipRule "evenodd"
            S.! Sa.d "M8 4a4 4 0 100 8 4 4 0 000-8zM2 8a6 6 0 1110.89 3.476l4.817 4.817a1 1 0 01-1.414 1.414l-4.816-4.816A6 6 0 012 8z"
      H.input H.! A.class_ "focus:ring-2 focus:ring-blue-500 focus:outline-none appearance-none w-full text-sm leading-6 text-slate-900 placeholder-slate-400 rounded-md py-2 pl-10 ring-1 ring-slate-200 shadow-sm"
        H.! A.type_ "text" H.! A.placeholder "Filter projects..."

-- A components for showing projects
projectsD projects =
  H.ul H.! A.class_ "bg-slate-50 p-4 sm:px-8 sm:pt-6 sm:pb-8 lg:p-4 xl:px-8 xl:pt-6 xl:pb-8 grid grid-cols-1 sm:grid-cols-2 lg:grid-cols-1 xl:grid-cols-2 gap-4 text-sm leading-6 dark:bg-slate-900/40 dark:ring-1 dark:ring-white/5" $
    mapM_ aProjectD  projects

aProjectD aProj =
  H.li H.! A.class_ "group cursor-pointer rounded-md p-3 bg-white ring-1 ring-slate-200 shadow-sm hover:bg-blue-500 hover:ring-blue-500 hover:shadow-md dark:bg-slate-700 dark:ring-0 dark:highlight-white/10 dark:hover:bg-blue-500 hidden sm:block lg:hidden xl:block" $
    H.a H.! A.href (H.textValue aProj.url) H.! A.class_ "hover:bg-blue-500 hover:ring-blue-500 hover:shadow-md group rounded-md p-3 bg-white ring-1 ring-slate-200 shadow-sm" $
      H.dl H.! A.class_ "grid sm:block lg:grid xl:block grid-cols-2 grid-rows-2 items-center" $ do
        H.div $ do
          H.dt H.! A.class_ "sr-only" $ H.toHtml "Title"
          H.dd H.! A.class_ "group-hover:text-white font-semibold text-slate-900" $ H.toHtml aProj.title
        H.div $ do
          H.dt H.! A.class_ "sr-only" $ H.toHtml "Category"
          H.dd H.! A.class_ "group-hover:text-blue-200" $ H.toHtml aProj.category
        H.div H.! A.class_ "col-start-2 row-start-1 row-end-3 sm:mt-4 lg:mt-0 xl:mt-4" $ do
          H.dt H.! A.class_ "sr-only" $ H.toHtml "Users"
          H.dd H.! A.class_ "flex justify-end sm:justify-start lg:justify-end xl:justify-start -space-x-1.5" $
            forM_ aProj.users (\u -> H.img H.! A.src (fullUnsplashUrl u.unsplID) H.! A.alt (H.textValue u.name) H.! A.class_ "w-6 h-6 rounded-full bg-slate-100 ring-2 ring-white")
  where
  fullUnsplashUrl anID =
    H.textValue $ "https://images.unsplash.com/" <> anID <> "?auto=format&fit=facearea&facepad=2&w=48&h=48&q=80"

demoPage aReason = 
  H.html $ do
    H.head $ do
      H.link H.! A.href "https://unpkg.com/tailwindcss@^1.0/dist/tailwind.min.css" H.! A.rel "stylesheet"
      H.link H.! A.href "ph_1/xstatic/css/pack_1.css" H.! A.rel "stylesheet"
      H.link H.! A.href "ph_1/xstatic/css/pack_2.css" H.! A.rel "stylesheet"
    H.body H.! A.class_ "dark antialiased text-slate-500 dark:text-slate-400 dark:bg-slate-900" $
      H.div H.! A.style "color: red; background: rgb(115 120 128)" H.! A.class_ "lg:col-span-5 xl:col-span-6 flex flex-col" $
        H.div H.! A.class_ "relative z-10 rounded-xl bg-white shadow-xl ring-1 ring-slate-900/5 overflow-hidden my-auto xl:mt-18 dark:bg-slate-800" $
          H.div H.! A.class_ "container mx-auto p-4" $ do
            H.h1 H.! A.class_ "text-2xl font-bold mb-4" $ H.toHtml aReason
            sectionD

demoPage "Just taking a break"

Just taking a break

Projects

New
  • Title
    First Project
    Category
    Testing
    Users
    Joe A Jane B
  • Title
    Second Project
    Category
    More testing
    Users
    Achmed A Simona B

Ok, not as good as the original, but still really cool to get that kind of result directly into Jupyter's cells. Doing some thinking of my own while looking at the CSS classes in the code, I feel that spending time understanding how Haskell helps me discover so many computational concepts ends up being much more rewarding than spending time on learning stylesheet rules...

Data Structures

The definition of data structures is an essential part of software development. The most basic construct for introducing a new structure in Haskell is the data construct. For example, one would define a new structure to keep track of the width and height of a raster image in the following way:

In [31]:
data ImageSize_1 = ImgSz_1 Int Int

which means that a value of the new structure called ImageSize_1 is the assembly of two values of type Int, and that an ImageSize_1 value is introduced by using the value constructor ImgSz_1.

And if you haven't guessed it already, a new data structure is a new type!

Note that in real life programmers prefer to the same name for the value constructor as for the structure name, as in:

data ImageSize_1 = ImageSize_1 Int Int

but that's just a convention and it can be anything that starts with an upper-case letter.

Why is Haskell using the keyword data for types instead of the more logical keyword type? It seems the aliasing concepts the language designers first early on for the keyword type weren't broad enough, and when the more powerful features came along there was already too many things dependant on the type keyword for a refactoring so the data keyword came about.

To create a value of type ImageSize_1, one would write:

In [32]:
isz_1 = ImgSz_1 200 300

It is valid, but it makes for rather uninformative code. It would be much better if we could immediately identify which of the two numbers is the width and which is the height. To solve this lack of information, we can use the classic Value.field record notation. It's classic but up to 2021 it wasn't part of Haskell at all (!!), and still today it requires the OverloadedRecordDot mode to be activated:

In [33]:
:set -XOverloadedRecordDot

data ImageSize_2 = ImgSz_2 { length :: Int, depth :: Int }

Now we have essentially the same structure as ImageSize_1 but it is better documented. To create a value of type ImageSize_2 with more explicit information, one can write:

In [34]:
isz_2 = ImgSz_2 { length = 200, depth = 300 }

-- the first syntax for is also valid:
isz_2_alt = ImgSz_2 200 300

With the record notation, the code for accessing field values in a data structure is as expected:

In [35]:
if isz_2.length < 200 || isz_2.depth > 500 then "Out of bound" else "No problemo"
"No problemo"

Haskell also uses an old school technique: pattern matching (Prolog anyone?). So to inspect the content of a ImageSize_1 value, we can use pattern matching to do so:

In [36]:
ImgSz_1 nw nh = isz_1
if nw < 200 || nh > 500 then "Kaboom" else "All good"
"All good"

The ImgSz_1 <v1> <v2> syntax on the left-hand side provides a pattern that will match a value of type ImageSize_1. If the right-hand side is of the same type, there is a binding for the <v1> and <v2> terms to the internal components of the value.

While it's old school, it turns out to be an elegant way to write code. Back in the days of expert systems and Prolog, it was a big thing. The syntax turns out to be very readable, especially when dealing with more complex definitions, so it makes it is still in use today.

Ok we can read fields, but how about modifying values of a record structure? Given the immutability of variables, we use a copy-on-write kind of approach:

In [37]:
isz_3 = isz_2 { length = 500 }
"Length: " <> show isz_3.length <> ", depth: " <> show isz_3.depth
"Length: 500, depth: 300"

As we can't modify the isz_2 content, ie isz_2.length = 500 isn't allowed, we create a new ImageSize_2 value with the same depth as isz_2 but with a length of 500.

The newcomer's reaction to this approach of creating new values instead of mutating existing values is often that Haskell code is wasting a lot of memory. But as the underlying mechanism for the value creation use copy-on-write, Haskell immutability constraint effect on memory consumption is the essentially the same as with other garbage-collected languages like Javascript, Python or Java. The real impact of this approach is that more variable names that may appear in the code as structure values are modified over the length of a function, and that is a good thing from a code clarity perspective. Computer memory isn't wasted, and in fact given the overall set of features in Haskell the memory consumption can end up to be equivalent to a C or Rust program.

The kind of structure we have defined with ImageSize_<x> is a product type. By contrast, a sum type represents a choice amongst alternatives provided in the structure definition. That's known as an union in C, enum in Rust, etc.

A sum type is defined by putting a list of type definitions seperated by the | symbol, ie giving the alternatives (think of the | symbol as a or keyword):

In [38]:
data Sizing =
  Image ImageSize_2            -- expressing the size for an image based on previous definition
  | Video Int Int Int Double   -- expressing the size based on some video concepts (width, height, duration, resolution)
  | Unknown String             -- an escape route to express some other size parameters.

Then we can create some values of type Sizing by writing:

In [39]:
-- Sizing for an image:
imgSz = Image isz_2
-- Sizing for a video:
vidSz = Video 1920 1080 32 29.9
-- Sizing somethign we don't know about:
unkSz = Unknown "1,2,3 and 4,5,6, and let's say big, big"

It is now a good time to revisit the case construct and see how it's used with sum types:

In [40]:
someSize = vidSz

case someSize of
  Image imgSize -> "length: " <> show imgSize.length <> ", depth: " <> show imgSize.depth
  Video w h _ _ -> "width: " <> show w <> ", height: " <> show h
  Unknown _ -> "Who knows!"
"width: 1920, height: 1080"

This code shows both the pattern matching and sum types in action to give readable code (but not necessarely clever or useful code...).

The combination of pattern matching with function definition is another way to provide readable code. In the following example, the function f1 is defined using pattern matching. The compiler is in charge of collecting all definitions of f1 and then verifying the argument at runtime to launch the execution of the right version of the function. This looks like:

In [41]:
-- create a synonym of the string conversion combined function for more readable code later on:
useString = H.toMarkup . T.pack

-- define a function that works with Image values:
f1 (Image imgSize) = do
    H.b "IMAGE"
    H.ul $ do
      H.li $ useString $ "length: " <> show imgSize.length
      H.li $ useString $ "depth: " <> show imgSize.depth

-- define a function that works with Video values:
f1 (Video w h _ _) = do
    H.b "VIDEO"
    H.ul $ do
      H.li $ useString $ "width: " <> show w
      H.li $ useString $ "height: " <> show h

-- define a function that works with Unknown values:
f1 (Unknown _) = H.b "Who knows!"

-- apply the f1 function to a value of type Sizing:
f1 vidSz
VIDEO
  • width: 1920
  • height: 1080

Note that this code uses

H.li $ useString $ "length: " <> show imgSize.length

instead of

H.li useString "length: " <> show imgSize.length

Why is that? We don't bother with parenthesis or commas to delimit the parameters of a function in Haskell. But in return, there's a basic rule to disambiguate which parameters are part of which function call, and that is left-to-right precedence.

If we used the second version of the code instead of the first one, the left-to-right precedence rule means that the H.li function, which takes a single parameter, will use the useString function and thus it doesn't compute! So we must tell the compiler that the right parameter to use is the result of the useString application to the rest of the line, and that's what the first $ symbol indicates. Then we have a similar issue with "length: " not being the right parameter for useString, so we put another $ to group the rest of the line into a string value. But then why isn't there a $ to group the show imgSize.length function call? That's because the left-to-right rule has level of precedence for each function, and it turns out that <> precendence is lower than normal. So unlike the two previous situation, the show imgSize.length will be executed first, and that will become the 2nd parameter of the <> function call, "length: " being the first parameter.

Looking at lists

We have so far covered a lot of ground, but we still have managed to stay away from a very important structure of programming languages: the list! Lists are especially important in functional languages as they normally constitute the basis for iteration. It is an essence of functional programming to use the scan and accumulate concept as a better way of doing of what other languages implement with the for-loop or do-while statements, and a central point of debate for the functional vs non-functional discussions.

The very first thing about a list in Haskell is that it's written with the same syntax that most other popular languages use for arrays:

In [42]:
aList = [ 1, 2, 3, 4, 5 ]

But that's not an array, it's a list. As in "think Lisp", ie singly linked list... That means there's no index to get elements and instead you have to use the head and tail or car : cdr technique. Haskell uses the : function to put a new element at the head of an existing list, as in:

In [43]:
6 : aList
[6,1,2,3,4,5]

And conversely, pattern matching with : is used to extract the first element of a list:

In [44]:
firstElement : restOfList = aList
firstElement
1

Adding more entries at the end of a list is a concatenation, and that's done with the <> function as seen before.

In [45]:
aList <> [ 6 ]
"allo" <> " byebye"
[1,2,3,4,5,6]
"allo byebye"

Developers that don't have a formal computer science will find it weird to use lists for all kind of iterative and recursive flow control, but hopefully those with the formal background will remember that they used that approach in all kind of algorithms instead of the lowly index scans.

Note that the String type is defined simply as a synonym for the list of Char type, or again [Char]. There is an important consequence to this definition: the basic string in Haskell is a singly linked list of Unicode characters! That is not necessarely the most efficient string implementation there is, and as we'll see soon it is a popular hobby amongst Haskell programmers to use alternative types for optimal string operations.

The typical iterations and recursions on list are done with either the map and fold family of functions, or using case construct with pattern matching on the car : cdr values. For example:

In [46]:
Prelude.map (H.b . useString . show) aList      -- apply a function to each element of aList

Prelude.foldl (*) 2 aList   -- multiply recursively each value of the list and double that.
1
2
3
4
5
240

Basic stuff...

For a bit more advanced list processing:

In [47]:
showOff inList = H.ul $ fRecur inList

-- Call itself recursively, but operate as an iterator over a list of values.
fRecur inList =
  case inList of
    [] -> H.b "-- 0 --"        -- Edge case, the list is empty.
    [car] -> itemize car       -- The list is a single element.
    car : cdr -> itemize car <> fRecur cdr       -- There are more than 1 element in the list, call itself to iterate over all elements.
  where
    itemize = H.li . useString . show

showOff aList

fRecur []
  • 1
  • 2
  • 3
  • 4
  • 5
-- 0 --

This example uses recursive accumulation to get things done. It also introduces the where construct that lets us provide a set of definitions after the main logic rather than before. So you can consider let and where as mirror of each other. Using let or where makes no difference to code execution sequencing, it's just providing flexibility for writing helper code either before or after the main logic.

Also in this example there's a function definition instead of a variable definition in the where construct. At the end of the day, variables are just functions with 0 argument, so defining functions or variables is exactly the same thing, both in the let and where syntax.

We have now covered a good base of Haskell syntax and functional concepts. And beside the section where we explicitely wanted to talk about typing, still no typing to bother with! 😀

Fetching web data

We'll now start to look at more relevant topics to web application development, starting with using HTTP, the basic communication protocol of the web.

To build an example we'll use a simple REST api provider, the USA's Car Vehicle API, to provide a live endpoint to work with. The API provides us with some information about vehicles that are used in the USA, embedded in JSON format.

The basic HTTP tool of Haskell, the Network.HTTP.Client package, to fetch data easily. We'll then use the popular Aeson package to upgrade automagically the JSON into Haskell data structures.

Also to illustrate a point about monadic stuff, we'll put the logic in a toy function.

In [48]:
import qualified Network.HTTP.Client as N
import qualified Network.HTTP.Client.TLS as N

vehicleFetch query =
  let
    url = "GET https://vpic.nhtsa.dot.gov/api/vehicles/" <> query
    request = N.setQueryString [("format", Just "json")] $ N.parseRequest_ url
    settings = N.tlsManagerSettings
  in do
    reqMgr <- N.newManager settings
    N.httpLbs request reqMgr
 
response <- vehicleFetch "getallmakes"

The function is split in two parts using a let construct. The first part specifies the values for url, req and settings terms.

The HTTP fetch happens in the second part, ie the in. Now we know that fetching data on the web is far from being a pure operation, ie consistently repeatable. In the Haskell world, that means the computations need to be wrapped in one of those famous monads, in this case the IO monad which contains the impure state of the real world. There is no indication that the IO type is being used in the code as it is all deduced by the compiler. But the two impure computations need to be performed in a strict context and order, and that's why the do keyword shows up just after in.

Once results are generated within the IO monad, we'll want to get them back into the pure world. The <- (left-pointing arrow) symbol does exactly that, extracting the values from the IO impure context and making them available for the pure computations.

The first impure operation is the definition of a HTTP communication manager. Having to specify that is the price to pay for using a low-level package which exposes the connection layer.

The next operation is the actual fetching of data, achieved by the httpLbs function. This being the last value encountered in the vehicleFetch function, it also become its result. That means that vehicleFetch returns values embedded in the IO monad.

If we were to specify typing for that function, it would gain:

vehicleFetch :: String -> IO (N.Response Data.ByteString)

meaning that the function takes a String parameter, and returns a ByteString-based HTTP Response wrapped in the IO type (ByteString is yet another string implementation!).

But why bother with type annotation?

Once the data is received from the web after invoking the vehicleFetch, the <- symbol is used again to extract it out of the IO monad and back in the pure world. If the Internet is working properly, the response value you get in the cell is a 200 HTTP reply with a body full of JSON data. I'll assume your Internet is fine, and first take a peek at the status and first 150 characters of the body, just to have an idea of what's inside.

In [49]:
import qualified Data.ByteString.Lazy as LBS
import qualified Data.Text.Lazy.Encoding as TL
import qualified Data.Text.Lazy as TL

print $ "status: " <> show response.responseStatus <> ", body: " <> (Prelude.take 150 . TL.unpack . TL.decodeUtf8 $ response.responseBody)
"status: Status {statusCode = 200, statusMessage = \"OK\"}, body: {\"Count\":11316,\"Message\":\"Response returned successfully\",\"SearchCriteria\":null,\"Results\":[{\"Make_ID\":12858,\"Make_Name\":\"#1 ALPINE CUSTOMS\"},{\"Make_ID"

That example should show you a 200 statusCode, and then a body where you can see the beginning of a long JSON payload of strings-in-a-string full of escaped sequences ('' is the escape character).

The example also shows the string conversion gymnastic that is typical of Haskell programming. There are many ways to encode string data (for scanning, for compactness, for high-speed string manipulations, etc). Given Haskell extensive level of precision available to manage strings, it becomes a habit to convert strings between the optimal implementations required by different utilisations. In our example, 3 typical string packages are imported and then the unpack and decodeUtf8 functions get the job done.

Assuming the HTTP response is indeed a 200, then we can move to upgrading the JSON string to Haskell data structures.

Using JSON and meta-programming

The conversion between JSON and Haskell values is normally done automagically. Nonetheless we'll look at how the magic happens, since it is an introduction to meta-programming features of Haskell.

The basic setup to manipulate code with code during compilation is the use of the GHC.Generics core Haskell package together with the DeriveGeneric and DeriveAnyClass modes.

Golang and other recent language compilers provide some features to transform a JSON string into a struct and vice-versa, something that is achieved through support logic that is fairly hidden within the compiler itself.

In Haskell the meta-programming system is used so that one can extend or modify at will the behavior of the transformation. It is a blessing for automatizing the generation of boiler-plate code and doing all kind of further code generation from a few lines provided by the developer.

The Aeson package comes with the necessary logic to introspect a data structure, ie a type, and automatically generate at compile-time the additional logic to parse JSON strings into values of that given type, and/or generate JSON representations of values of that type.

For example:

In [50]:
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE DeriveAnyClass #-}

import GHC.Generics
import qualified Data.Aeson as J

data SimpleThing = ST {
    aField :: String
    , anotherField :: Int
  }
  deriving (Show, Generic, J.FromJSON, J.ToJSON)

The SimpleThing type defined here is a simple record with a String and an Int fields, there's nothing special about that. The interesting part is the deriving keyword that comes after the data structure definition, together with the tuple of type names (remember, upper case first letter is for types and modules). That derivation command triggers the Haskell compiler to launch meta-programming handling and effectively generate all the required logic to convert a value into a string representation (Show), to create an introspective representation of the type (Generic), and finally to create conversion logic to go back and forth between a value of SimpleThing and its JSON representation (FromJSON, ToJSON).

None of that generated code shows up, but the compiler will remember it's there and will use it whenever required.

Note that Show, Generic and etc aren't really types, they are typeclasses... we'll talk about the difference in a few cells.

Now let's create a value of SimpleThing type, then show it as a string and go between Haskell internal representation and JSON. All that is standard operations in web applications:

In [51]:
simpleThing = ST { aField = "first value", anotherField = 100 }

-- Use the Show code to convert the value to a string.
print simpleThing

-- Use the ToJSON code to get a JSON string out of the value.
J.encode simpleThing

-- Parse the JSON string to get a SimpleThing value.
simpleJson = "{\"aField\":\"first value\",\"anotherField\":100}"
show (J.decode simpleJson :: Maybe SimpleThing)
ST {aField = "first value", anotherField = 100}
"{\"aField\":\"first value\",\"anotherField\":100}"
"Just (ST {aField = \"first value\", anotherField = 100})"

Having derived SimpleThing with ToJSON, the encode function will apply the Aeson conversion logic and convert a SimpleThing value to a JSON string as we'd expect.

Then the decode simpleJson function is asked to be of type Maybe SimpleThing. That instructs the compiler to apply the Aeson conversion logic to go from a JSON string to a SimpleThing value.

This is the first time we encounter the Maybe type, a pillar of Haskell programming. It is the standard way to manage optional results, similarly to the ? symbol in Rust or TypeScript. Most importantly Maybe does away with the usual null or undefined issues of so many other programming languages. It is an additive type, which uses two value constructors: Nothing and Just <a-value>. So when a function can either emit a result or return a void, we'll define it as a Maybe <some-type> function.

In this example the decode function does its magic and returns a Just SimpleThing value, which is then shown as a string. If the decode function couldn't figure out how to parse the JSON, it would simply return a Nothing. If you worry about error management, decodeEither is the answer but I won't get into that now.

The way Aeson meta-programming logic can be expanded for extension/modification is especially important given the way AI code generation is coming about. Unfortunately meta-programming is a huge topic and we need to keep moving, so we'll cover very little of that amazing Haskell feature.

Going back to the vehicleFetch example, we can now define the ResponseVehicle and CarDescription data structures to hold the content returned in JSON format.

When using 3rd party APIs in real life, it's not always neat and simple to parse the JSON fields. In the case of the Vehicle API, the top-level JSON object fields start with an uppercase letter, and then the individual car description object has a 'Make_' prefix for its fields that's redundant. On the other hand, Haskell fields in a record must start with a lowercase letter, and we don't want to have to carry the 'make_' prefix in our code.

To deal with the structure impedence, we'll inject a few modifications to the default meta-programming logic that deals with the conversion of JSON representations, using the instance keyword instead of deriving one to specify our own FromJSON implementation. And yes, instance means instance of a class! As mentioned previously, Haskell as a typeclass concept. It isn't object-oriented programming but it does enable to group common features into classes and then assign them to types through derivation. Further more these features can be extended as required for meta-programming purposes through instantiation specialization. It's yet another very interesting aspect of Haskell that we don't have time to dig into!

Here's the code to represent the vehicle info in Haskell as a ResponseVehicle and CarDescription types, and parse the respective JSON representations:

In [52]:
import qualified Data.Char as C

-- Specify the structures for holding the API overall reply and each car specification.
data ResponseVehicle = ResponseVehicle {
    count :: Int
    , message :: String
    , searchCriteria :: Maybe String
    , results :: [ CarDescription ]
  }
  deriving (Show, Generic)

data CarDescription = CarDescription {
    uid :: Int
    , name :: T.Text
  }
  deriving (Show, Generic)


-- Extend the JSON-conversion logic of Aeson with specific logic for the Vehicle API format:
instance J.FromJSON CarDescription where
  parseJSON (J.Object obj) = CarDescription <$> obj J..: "Make_ID" <*> obj J..: "Make_Name"

instance J.FromJSON ResponseVehicle where
    parseJSON = J.genericParseJSON J.defaultOptions { J.fieldLabelModifier = firstUpper }

-- Define a 'make first letter upper-case' helper:
firstUpper (c:cs) = C.toUpper c : cs
firstUpper [] = []

The parseJSON function of the FromJSON typeclass in the Aeson package is the entry point for the basic conversion from JSON to a Haskell value. In the code above, the instance J.FromJSON CarDescription where means that we'll add to the existing meta-programming logic of Aeson's FromJSON typeclass. We then provide a new definition for parseJSON that builds a CarDescription value from the Make_ID and Make_Name fields found in the JSON data.

The somewhat cryptic <$> and <*> symbols are simply helper functions used to apply other functions within the containers the JSON values are stored in. They are worth looking into a bit more.

Let's use these helpers for doing math with values that are within the Maybe container. Let's say we have some numbers from a function that may return nothing, so they are contained by a Just:

In [53]:
(+) <$> Just 2 <*> Just 5
Just 7

The <$> push the + function into the first parameter, and the <*> connects it into the second one. That means we can add the integers and then put the sum back into the same kind of Just container. The neat thing is that those helpers follow the rules of the container, so in the case of:

In [54]:
(+) <$> Just 4 <*> Nothing
Nothing

we see that the result isn't a crash but rather a reasonable Nothing value. That's because the rules of the Maybe type are enforced by <$> and <*>, leading to the + function behaving correctly even with a Nothing value although it doesn't know anything about the Maybe container.

Note: the weird J..: syntax is Aeson's field value extractor function .: prefixed with the J package. As noted before, there are just few syntax rules in Haskell, so exotic function names are frequent, such as .:, or ! for Blaze. The decision to use the package prefix everywhere in the code of this notebook is resulting in some weird looking code...

Going back to the parseJSON function extended for the ResponseVehicle data structure, we use a different strategy. Instead of providing new extraction logic in the extension we simply modify the configuration parameters of Aeson's genericParseJSON function: the fieldLabelModifier of the defaultOptions value is updated with a function that maps the first character of each field name to an uppercase value. That's it.

Getting the JSON data returned by the Vehicle API to Haskell values is then a simple J.decode call; the on-the-fly generated logic takes care of field mapping, sub-objects resolution, array parsing, etc. Further more, if you were to look at the generated code (it's possible to ask the compiler to spit it out), you'd see it is fast and efficient because it's fine-tuned for each kind of JSON conversion required. For example we used the Text type for the name field in the CarDescription, which is a more efficient string implementation than String, and thus the generated JSON parsing logic for that perticular kind of string will also be used in the CarDescription JSON translation logic. Compare that to for example Go JSON parsing of complex structures with conditionals and recursion...

In the next block of code, we bring back Blaze to get HTML beautification for the values converted from JSON and to show how we typically deal with Maybe situations. Why do print when it's simple and so much nicer preview values of complex types in HTML?

In [55]:
showCar aDescription = do
  H.b . useString . T.unpack $ aDescription.name
  H.i . useString $ " (id: " <> show aDescription.uid <> ")"

mbApiResponse = J.decode response.responseBody :: Maybe ResponseVehicle

H.div $ case mbApiResponse of
  Nothing -> H.b "No result."
  Just result ->
    let
      count = result.count
    in
      if Prelude.length result.results == 0
        then H.b "Nothing???"
        else do
          H.b . useString $ "Got " <> show count <> " descriptions, the first few ones are:"
          H.br
          H.ul $ mapM_ (H.li . showCar) (Prelude.take 5 result.results)
Got 11316 descriptions, the first few ones are:
  • #1 ALPINE CUSTOMS (id: 12858)
  • 1/OFF KUSTOMS, LLC (id: 4877)
  • 102 IRONWORKS, INC. (id: 11257)
  • 12832429 CANADA INC. (id: 12255)
  • 17 CREEK ENTERPRISES (id: 6387)

At this point you should be able to read the code easily. The only new item is the mapM_ function, which is an iterator specialized for Monads values.

As you can see, server-side HTML content generation from scratch is simple. But for many people, it's even simpler to work out of text templates. Let's do that too!

Using Jinja2 HTML templates

In the Haskell ecosystem, the Ginger package is the go-to for using Jinja2 templates.

In the following example, the DotValue structure will represent our app-specific information. We extend the Ginger logic using meta-programming on the DotValue that will later be applied in the template runtime context. Then a text file is loaded (it's the template!), it is parsed and finally the Ginger VM runs the template logic with the data context to produce HTML.

The first test template is using Tailwind CSS to add styling to the Jupyter notebook bland look, and it's also using the D3.js library for implementing data visualization. That has nothing to do with Haskell but at this point using markdown is boring.

First let's define DotValue, which will be a sum type that holds 2 simple data structures, for string and circle, and two container values, for list and dictionary. You can see here the common functional habit of using self-referencing definitions:

In [56]:
import Control.Monad.Identity (runIdentity)
import qualified Data.Map as Mp
import qualified Text.Ginger as Gi
import qualified Text.Ginger.Html as Gi

data DotValue =
  StrV T.Text
  | CircleV Int Int Int String  -- quick and dirty way to provide x,y pos, radius and color.
  | DictV (Mp.Map T.Text DotValue)
  | ListV [ DotValue ]

Then we provide the logic for transforming a DotValue value into a Ginger's GVal. For the first 3 of the 4 kind of values, Ginger default converter will do the remap automatically. The CircleV conversion requires some work, as each of the record's fields needs to be mapped into a named value. That additional informatin will tell the Javascript code in the template how to pass along the right data to the D3.js functions for getting SVG diagrams.

In [57]:
-- Enable the use of the "\case" shortcut (f x = case x of ... => f = \case ...):
{-# LANGUAGE LambdaCase #-}

instance Gi.ToGVal m DotValue where
  toGVal = \case
    StrV aString -> Gi.toGVal aString
    DictV aMap -> Gi.toGVal aMap
    ListV aList -> Gi.toGVal aList
    CircleV cX cY radius color -> Gi.toGVal $
            Mp.fromList [
              ("cx", show cX)  :: (T.Text, String)
              , ("cy", show cY)
              , ("radius", show radius)
              , ("color", color)
            ]

The Map dictionary is initialized from a list of pairs (2-tuples).

WARNING: TYPES INVOLVED! The Map requires that we specify the key and value types given there are many valid choices. We provide that information for the first value of the list and that is enough to get all the rest of the typing figured out.

In this example we use a common Haskell structure, the tuple (<value>, <value>). Tuples are standard in most modern languages, and I expect they work just like you think they should.

We then assemble some mock data and store it all as the demoData_1 term. Although we're dealing with complex typing possibilities, the compiler is still able to figure out what is what and we have no typing details for any function to provide.

Also let's recycle the data from the Vehicle API used earlier on if there's some available.

In [58]:
demoNavList = case mbApiResponse of
  Nothing ->
    let
      nav_1 = Mp.fromList [
         ("url", StrV "https://gaga.com/url_1")
       , ("label", StrV "Gaga")
       ]
      nav_2 = Mp.fromList [
          ("url", StrV "https://gougou.com/url_1")
       , ("label", StrV "Gougou")
       ]
    in 
    ListV [ DictV nav_1, DictV nav_2 ]
  Just apiResponse ->
    let
      subList = Prelude.take 5 apiResponse.results
    in
      ListV $ Prelude.map carDescToTuple subList
  where
    carDescToTuple cd =
        DictV $ Mp.fromList [
            ("url", StrV . T.pack $ "http://https://vpic.nhtsa.dot.gov/api/vehicles/getmake?id=" <> show cd.uid)
            , ("label", StrV cd.name)
        ]

circles = [
    CircleV 20 90 40 "green"  
    , CircleV 50 90 45 "red"  
    , CircleV 85 90 50 "blue"  
  ]

demoData_1 = Mp.fromList [
    ("title", StrV "Haskell, Hidden Gem")
    , ("navigation", demoNavList)
    , ("circles", ListV circles)
  ]

As usual, no need to bother with type annotations. Yet it's worth looking under the hood and find out what was inferred for the demoData_1. That's done with the :t feature of the notebook, as in:

In [59]:
:t demoData_1
demoData_1 :: forall {k}. (Ord k, IsString k) => Map k DotValue

The type definition provided by the compiler shows there are a Map of DotValue values involved (that's the dictionary and its values). Interestingly for the keys the only possible inference is on polymorphic constraints: Ord k means k can be ordered, and IsString k means k must be able to be dealth with as a String. Even if the inference is partial, it's enough to move forward. Furter use of that demoData_1 function will resolve what kind of string variation the keys are using.

While it may be convenient to let the compiler do all the type management, it is a good habit to lock in some of the types, so a more prudent approach of writing the demoData definition would be to specify:

demoData_1 :: Mp.Map T.Text DotValue

if we want to make sure that the Text implementation of strings is used.

The readFile function takes care of loading data from a file, that's basic stuff. But as this is an impure operation, we use the <- symbol to extract the result back into the pure context. If the file doesn't exist, the readFile function will throw an exception. Yes, the Haskell runtime uses exceptions...

In [60]:
template_1 <- Prelude.readFile "ph_1/xstatic/templates/templ_1.html"

The handling of included templates and errors that can show up during the parsing of the template text just loaded up requires to provide a few things when calling the `parseGinger' function:

In [61]:
loadTemplate aTemplate = do
  either (error . show) id . runIdentity $
    Gi.parseGinger (const $ return Nothing) Nothing aTemplate

The first parameter of parseGinger is a file resolver to handle additional templates included in the first one; it is a function that will receive a path and figure out how to fetch additional content. In our case, we pass a do-nothing function, const $ return Nothing, so no additional templates can be included.

Note: this being Haskell, return is not a keyword, but a function! It does the opposite of the <- extractor used in monadic context, embedding a naked value into a monadic container. In the case of the IO monad, it makes the Nothing value become the equivalent of an IO operation that didn't produce anything.

The Ginger VM needs a way to dereference values during the template execution. We use the lookup function to do that: it simply gets the template variable name as a parameter called needle and finds that in the demoData dictionary.

Then the logic provided in the instance of ToGVal for DotValue will do the rest of the work in getting the data into the constructed HTML.

In [62]:
derefLabel needle =
  Gi.toGVal $ Mp.lookup needle demoData_1

tmplDemo =
  Gi.runGinger (Gi.makeContextHtml derefLabel) (loadTemplate template_1)

Gi.htmlSource tmplDemo
Haskell, Hidden Gem

Haskell, Hidden Gem

Name Destination
#1 ALPINE CUSTOMS #1 ALPINE CUSTOMS
1/OFF KUSTOMS, LLC 1/OFF KUSTOMS, LLC
102 IRONWORKS, INC. 102 IRONWORKS, INC.
12832429 CANADA INC. 12832429 CANADA INC.
17 CREEK ENTERPRISES 17 CREEK ENTERPRISES

The htmlSource function simply converts the Ginger internal representation of HTML produced by the runGinger function into the one required by the notebook.

You now know how to deal with JSON, fetch data from the web, create HTML content with either html-like code or Jinja2 templates.

Before moving forward, have a quick reflection on the template-based HTML generation vs going through interpreted Blaze logic. When doing quick iterations on developing a good looking HTML page, I now find myself to prefer the later... In fact, using Elm is my preferred way to generate HTML.

What else does a full-stack web app developer continuously work with in daily tasks? Databases and HTTP endpoint servicing!

Domain Specific Language and DB access

The Domain Specific Language (DSL) capability of Haskell is a very cool feature that is somewhat at odds with it. You have a language that is promoting a simple syntax and a functional approach... and then boom! with DSL the door opens up to pretty much any syntax and computing model you can come up with.

DSLs syntax are brought in by doing the relevant package import and then using the QuasiQuotes blocks to switch between Haskell and a DSL syntax. This also requires to enable the QuasiQuotes mode with:

In [63]:
{-# LANGUAGE QuasiQuotes #-}

We'll use some database interactions to showcase the benefits of DSLs. No matter how many ORM packages people come up with in their favorite ecosystem, they'll eventually conclude that SQL is the way to go for manipulations on anything bigger than toy datasets. Normally that means you have to write SQL statements as string constants with a bunch of question marks for parameter holes, invoke some execution functions, map out the results, etc. It's error prone, it's very error prone, it's incredibly error prone and actually awkward. If only you could write SQL as a first-class citizen, merged into the rest of the logic you can express with your favorite syntax.

The solution exists, it is embedded SQL: the ability to use your preferred language together with SQL statements that will be analyzed, syntax-verified, type-impedence resolved, and finally that will be executed transparently at runtime. And if you have embedded SQL, you also want that in the latest version of the compiling toolchain at your disposition (when was the last version of the Pro*C compiler released?).

This is exactly what we can achieve with a SQL DSL in the GHC compiler, which means Haskell instantly becomes your favorite programming language.

The Hasql-th package provides the SQL syntax as a DSL as part of the excellent Hasql family of facilities for Postgresql operations. In other words, you get a great support for embedded SQL in Haskell code.

Rather than doing function calls that pass SQL statements as strings, we can instead do SQL queries that are syntax- and type-checked at compile time. So much for code breaks at runtime because you wrote slect a from ... instead of select a from ....

To get started, a bunch of packages are brought into the namespace.

In [64]:
import Data.ByteString

import qualified Hasql.Connection as DbC
import qualified Hasql.Session as Se
import Hasql.TH as Th

import qualified Data.Vector as Vc

To connect to the DB, credentials should be to get them from outside the notebook, eg from the Jupyter process' environment. If there's a DBINFO env var that contains the connection params separated by ':' symbol, it would be a matter of spliting that var value into 4 entries and then use that to define the dbSettings, as in:

In [65]:
import qualified System.Environment as SE
import qualified Data.Text.Encoding as T

dbInfo <- SE.getEnv "DBINFO"

[ host, db, user, paswd] = Prelude.map T.encodeUtf8 $ T.splitOn ":" (T.pack dbInfo)

Note: the expectation here is that the env variable is made of a set of colon-separated values, it is loaded as a String, converted into Text, passed to the splitOn function (String has no such feature) which returns an array of values, each value is converted to a ByteString with the encodeUtf8, and finally stored in specific variables. Always fun to do some string gymnastics.

But as it's not clear how the Jupyter runtime for this notebook is spawned and given we use a free-tier demo db from SupaBase, hiding the connection information isn't even worth it:

In [66]:
dbSettings = DbC.settings "aws-0-eu-central-1.pooler.supabase.com" 5432 "demoa.vtjcwbccdutdqcsxdaqr" "justForDemo" "demoha"

Our demo database provides basic information about the top starred and forked GitHub repos for the last 5 years. The source information is at GitHub Top 100; I've simply parsed and normalized the CSV files into a more efficient SQL representation.

The GhProject structure provides a way to hold the de-normalized data within the interpreter runtime context. The example SQL code simply does a select/join over all normalized tables to fetch the data in the de-normalized format straight off the DB rather than having to performi that work in the Haskell side. In real life, you'll consider that such a de-normalization puts the workload on the RDBMS and the network bandwidth, and that it might be more efficient to fetch the data normalized to minimze RDBMS resources and network bandwidth utilization, and de-normalize in the app runtime context instead.

In [67]:
import qualified Data.Time.LocalTime as Lt
import qualified Data.Int as DI

data GhProject = GhProject {
    rank :: DI.Int32
    , section :: T.Text
    , repo :: T.Text
    , stars :: DI.Int32
    , forks :: DI.Int32
    , language :: Maybe T.Text
    , url :: Maybe T.Text
    , uname :: T.Text
    , issues :: DI.Int32
    , lastCommit :: Lt.LocalTime
    , description :: Maybe T.Text
  }

data TopGithubCtxt = TopGithubCtxt {
    headColumns :: [ T.Text ]
    , rows :: GhProject
  }

Now we're getting to the real stuff: write a SQL statement to extract rows from the DB.

Note: the syntax switch between Haskell and SQL is done using the [<context>| <statements ] QuasiQuoter block.

The Hasql-th package provides a [Th.vectorStatement|... |] block to indicate that the SQL statement will return a vector of values. In that block, the query is written in almost standard SQL; that code will go through a proper SQL parser to produce internally the logic that will run in the DB engine. Additionally the compiler will create glue code to pass values to the query from the Haskell world, and convert the raw data returned by the DB into a Vector of Haskell values. Being able to use the compiler to verify the SQL increases tremendously the reliability of the data manipulation code at runtime and eliminates a lot of testing requirements.

To retrieve a bunch of rows from our GitHub dataset, we define the simpleQuery function with embedded SQL:

In [68]:
import qualified Data.Profunctor as Pf
import qualified Data.Tuple.Curry as Tc

simpleQuery repoName =
  Se.statement (repoName) $ Pf.dimap matcherFromString toGhProject
    [Th.vectorStatement|
        select
            o.rank::int4, ta.label::text, tb.label::text, o.stars::int4, o.forks::int4
            , tc.label::text?, td.label::text?
            , te.label::text, o.issues::int4, o.last_commit::timestamp, o.description::text?
        from GithubTop100 o
            left join itemu ta on o.itemfk = ta.id
            left join RepoNameU tb on o.reponamefk = tb.id
            left join LanguageU tc on o.languagefk = tc.id
            left join UrlU td on o.urlfk = td.id
            left join UserU te on o.unamefk = te.id
        where tb.label ilike $1::text
        order by date_trunc('day', o.last_commit) desc
            , itemfk, rank
        limit 10
    |]
  where
  matcherFromString n = "%" <> n <> "%"
  toGhProject = fmap (Tc.uncurryN GhProject)

By almost standard SQL, I meant that the standard SQL syntax is extended with some type-annotations so that fields in the select statement are typed to both check consistency and specify SQL-to-Haskell value conversion.

The first parameter to the statement function specifies the values to use as dynamic entries in the SQL statement. In this case we only pass the repoName variable to limit the data fetch to a single repository. The 2nd parameter to the statement function can be either the QuasiQuoter block by itself or, as in the case here, conversion logic for dynamic entries and returned values that can be applied with a dimap association. Detailing the ProFunctor package where dimap is defined goes beyond the scope of this notebook, but you can figure out that the matchterFromString is an encoder function which simply adds SQL wildcards around the repo name, and a decoder function toGhProject which simply converts raw tuples of values returned by the select SQL statement into GhProject values through the application of the uncurryN helper (uncurryN aFct (a1, a2, a3) = aFct a1 a2 a3).

Parameters passed to the SQL block are referred to using their position. The $1::text notation that is used in the where clause means the repoName wildcard-wrapped by the matcherFromString encoder. The implementator of Hasql-TH preferred to use positional referencing, but with Haskell DSL it is also possible to use named params in QuasiQuoter blocks. In fact the Hasql-Interpolate extension package to Hasql will let you do just that and use #{repoName} notation instead of $1::<type>. The Haskell-to-SQL conversion is in this case automatic, but it will normally use meta-programming modifications when SQL or Haskell values are complex.

Given we have a simpleQuery function with embedded SQL, the next thing is to execute it within a DB session to get a Vector of GhProject values (or an error).

Enough blah-blah, let's get a session to the DB server:

In [69]:
dbConn <- DbC.acquire dbSettings

Going out over a socket to a DB server somewhere on the Internet is definitely not pure, so we use the <- to bring back the result from the acquire call in pure form.

The next step is to execute the query, and then continue based on whether a correct or erronous result was sent by the DB engine.

It's essentially now just a matter of invoking Hasql's run function. But we live in an imperfect world so we'll wrap the run call with proper error management that will catch any anomaly. It's always a very good approach to put a maximum of runtime checking around distributed data operations:

In [70]:
dbVal <- case dbConn of
  Right conn -> do
    rezA <- Se.run (simpleQuery "bootcamp") conn
    case rezA of
      Left err -> pure . Left $ show err
      Right vcVal -> pure $ Right vcVal
  Left err -> pure . Left $ show err

Once that cell has run, either the DB server executed the SQL statement and we've got back a vector of values, or there has been a runtime error. The no-error/error situation is indicated with the Either monadic type, which uses Right and Left as containers meaning no-error and error respectively, as you can find in Rust Result or C++ std::expected.

If we got a good result from the query, we'll format it using HTML. It's much nicer to look at than a bland print... Using the Jinja template approach, we'll build a GVal instance of GhProject as a dictionnary of fields that will be used to populate an HTML template. Using Generics can do the conversion just by derivation, but hopefully you find it more informative to see how a bunch of fields are mapped to the Ginger VM context.

The following code loads a template file, defines the GhProject mapping to GVal, adds some formatting for erroneous situations, and finally runs the template engine with the dataset received from the DB so the data is shown in a HTML format (be patient, the DB query may take a while to complete):

In [71]:
import qualified Data.Maybe as Mb

template_2 <- Prelude.readFile "ph_1/xstatic/templates/templ_5.html"


instance Gi.ToGVal m GhProject where
  toGVal p = Gi.toGVal $ Mp.fromList [
        ("rank", T.pack $ show p.rank) :: (T.Text, T.Text), ("section", p.section), ("repo", p.repo), ("stars", T.pack $ show p.stars)
        , ("forks", T.pack $ show p.forks), ("language", Mb.fromMaybe "<nil>" p.language), ("url", Mb.fromMaybe "<nil>" p.url)
        , ("uname", p.uname), ("issues", T.pack $ show p.issues), ("last_commit", T.pack $ show p.lastCommit)
        , ("description", Mb.fromMaybe "<nil>" p.description)
    ]

derefGhProj context needle =
  case needle of
    "columns" ->
      Gi.toGVal [
          "Rank" :: String, "Section", "Repo", "Stars", "Forks", "Language"
          , "Url", "User", "Issues", "Latest"
        ]
    "results" -> case context of
      Left _ -> Gi.toGVal ("error" :: T.Text)
      Right vcProjs -> Gi.toGVal $ Vc.toList vcProjs
    _ -> Gi.toGVal ("error" :: T.Text)


Gi.htmlSource $ Gi.runGinger (Gi.makeContextHtml (derefGhProj dbVal)) (loadTemplate template_2)
GitHub TOP 100

A embellished presentation the top starred and/or forked repos of GitHub for the last 5 years.

Rank Section Repo Stars Forks Language Url User Issues Latest Description
7 top-100-forks Complete-Python-3-Bootcamp 25295 81802 Jupyter Notebook Complete-Python-3-Bootcamp Pierian-Data 120 2023-11-24 14:21:32 Course Files for Complete Python 3 Bootcamp Course on Udemy
7 top-100-forks Complete-Python-3-Bootcamp 25267 81756 Jupyter Notebook Complete-Python-3-Bootcamp Pierian-Data 119 2023-11-24 14:21:32 Course Files for Complete Python 3 Bootcamp Course on Udemy
7 top-100-forks Complete-Python-3-Bootcamp 25328 81847 Jupyter Notebook Complete-Python-3-Bootcamp Pierian-Data 120 2023-11-24 14:21:32 Course Files for Complete Python 3 Bootcamp Course on Udemy
7 top-100-forks Complete-Python-3-Bootcamp 25313 81832 Jupyter Notebook Complete-Python-3-Bootcamp Pierian-Data 120 2023-11-24 14:21:32 Course Files for Complete Python 3 Bootcamp Course on Udemy
7 top-100-forks Complete-Python-3-Bootcamp 25276 81774 Jupyter Notebook Complete-Python-3-Bootcamp Pierian-Data 119 2023-11-24 14:21:32 Course Files for Complete Python 3 Bootcamp Course on Udemy
7 top-100-forks Complete-Python-3-Bootcamp 25333 81862 Jupyter Notebook Complete-Python-3-Bootcamp Pierian-Data 120 2023-11-24 14:21:32 Course Files for Complete Python 3 Bootcamp Course on Udemy
7 top-100-forks Complete-Python-3-Bootcamp 25274 81761 Jupyter Notebook Complete-Python-3-Bootcamp Pierian-Data 119 2023-11-24 14:21:32 Course Files for Complete Python 3 Bootcamp Course on Udemy
7 top-100-forks Complete-Python-3-Bootcamp 25318 81840 Jupyter Notebook Complete-Python-3-Bootcamp Pierian-Data 120 2023-11-24 14:21:32 Course Files for Complete Python 3 Bootcamp Course on Udemy
7 top-100-forks Complete-Python-3-Bootcamp 25280 81786 Jupyter Notebook Complete-Python-3-Bootcamp Pierian-Data 120 2023-11-24 14:21:32 Course Files for Complete Python 3 Bootcamp Course on Udemy
7 top-100-forks Complete-Python-3-Bootcamp 25304 81816 Jupyter Notebook Complete-Python-3-Bootcamp Pierian-Data 120 2023-11-24 14:21:32 Course Files for Complete Python 3 Bootcamp Course on Udemy

That's it! Now you know how to interact with a SQL DB server in a safe and efficient way, convert data back and forth between Haskell structures and DB tables, and it's also easy to create HTML out of datasets from Jinja templates.

Javascript as a DSL

For fun, let's look at another DSL: Javascript. Yep, it's possible to write JS code as part of the Haskell code, have it syntax-checked and then sent to a NodeJS engine for processing.

I don't bother to describe what is going on, but at this point you should be able to understand the whole sequence of code.

In [72]:
{-# LANGUAGE DerivingVia #-}
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE DeriveAnyClass #-}
{-# LANGUAGE ViewPatterns #-}
In [73]:
import Control.Exception (try, SomeException, evaluate)
import Control.Concurrent (threadDelay)

import Data.Aeson ( FromJSON )
import GHC.Generics ( Generic )
import Language.JavaScript.Inline
import Language.JavaScript.Inline.Core
import qualified Data.ByteString.Lazy as LBS
import Data.Text (Text, unpack)

import Control.Monad (ap)
import GHC.TypeLits (KnownSymbol)
import Data.Text.Array (run)
import Data.Vector.Generic.Mutable (set)
In [74]:
newtype JSReturn = JSReturn {
    result :: String
  } 
  deriving (Generic, Show, FromJSON)
  deriving FromJS via (Aeson JSReturn)
In [75]:
runJSTest :: IO JSReturn
runJSTest = do
  session <- newSession defaultConfig
  putStrLn "@[runJSTest] starting."
  rezA <- eval session [js|
    console.warn("@[runJSTest] JS starting.")
    let a = "allo"
    let b = "byebye"

    doTest = async () => {
      console.warn("@[runElmTest] JS doTest start.")
      setTimeout(() => console.warn("timeout!"), 5 * 1000)
      console.warn("@[runElmTest] JS doTest end.")
      return { "result": "run_JS_1: " + a + ", " + b}
    }
    finalRez = await doTest()
    console.warn("@[runJSTest] JS ending.")
    return finalRez
   |]
  closeSession session
  putStrLn "@[runJSTest] finishing."
  pure rezA
In [76]:
runJSTest
@[runJSTest] starting.
@[runJSTest] JS starting.
@[newSession] msg : JSEvalResponse {jsEvalRespon@[runElmTest] JS doTest start.
seId = 288@[runElmTest] JS doTest end.
7, jsE@[runJSTest] JS ending.
valResponseContent = Right "{\"result\":\"run_JS_1: allo, byebye\"}"}
timeout!
@[runJSTest] finishing.
JSReturn {result = "run_JS_1: allo, byebye"}
In [77]:
runJS_B :: IO JSReturn
runJS_B = do
  session <- newSession defaultConfig
  rezA <- eval session [js|
    console.warn("@[runJSTest] JS starting.")

    console.warn("@[runJSTest] JS ending.")
    return { "result": "JS_B: pwd = " + process.cwd() }
   |]
  closeSession session
  pure rezA
In [78]:
runJS_B
@[runJSTest] JS starting.
@[newSession] msg : JSEvalResponse {jsEvalResponseId = 2954, jsEvalResponseContent = Right "{\"result\":\"JS_B: pwd = /home/lhugo/P@[runJSTest] JS ending.
rojets/Fudd/IHaskell/IHaskell/notebooks_2\"}"}
JSReturn {result = "JS_B: pwd = /home/lhugo/Projets/Fudd/IHaskell/IHaskell/notebooks_2"}

The Javascript code does nothing useful, but it does showcase another kind of DSL with distributed execution. Interestingly you'll notice that the feedback from the Javascript code execution is asynchronous, so you can start running the runJS_B demo while the 'runJSTest` is still waiting for the timeout code to trigger.

Many language grammars have been encoded as DSL for Haskell. So you can let your imagination wonder a bit and think how interesting it is to connect let's say a large Python framework running within a Celery cluster or some Erlang application and interact with them from the Haskell code...

Enough fun with DSLs, we can now move on toward the last part of the introduction: web endpoints.

Serving some Web endpoints: the basics

The Web service architecture of the Haskell ecosystem has stabilized over the years the Web Application Interface (Wai) package. It provides a mature and efficient base layer for handling and responding to HTTP requests.

On top of WAI is another very stable package, the light-weight web server Warp. Together WAI and Warp provide a simple way to quickly prototype a web service.

Let's use the WAI/Warp pair to build a quick-and-dirty web handler, as one does in real-life to prototype an idea while testing stuff in the interpreter.

As usual, a few packages need to be brought into the namespace to move forward.

In [79]:
import qualified Control.Concurrent as Cc
import qualified Network.Wai as W
import qualified Network.Wai.Handler.Warp as Wr
import qualified Network.HTTP.Types as Ht

Then we create a basic HTTP handler called waiDemoServer which will handle the "/marco" route for incoming clients. The reply will be an unconditional responder sending back the tmplDemo HTML template content constructed earlier in the notebook.

In [80]:
waiDemoServer request sendFct =
  sendFct $ W.responseLBS
    Ht.status200
    (case Prelude.lookup "marco" $ W.requestHeaders request of
       Nothing -> []
       Just val -> [("Polo", val)])
    (LBS.fromStrict . T.encodeUtf8 . Gi.htmlSource $ tmplDemo)

There are two arguments for waiDemoServer: first the request received from the HTTP client, and second a responder function. That interface for waiDemoServer is standard stuff defined by WAI.

The behavior of our toy server will be to reply with a 200 (OK) status code together with the HTML content in tmplDemo. Additionally if the URL of the request matches the /marco route, a Polo value is sent back in the headers of the response just to show how that works.

The web server is started with a run <port> <service-function>. As we're working in a Jupyter notebook context, we'll fork the service to a sub-process that we can kill later:

In [81]:
tid <- Cc.forkIO (Wr.run 8181 waiDemoServer)
print tid
:! date
ThreadId 5467
Wed May 22 02:56:43 PM +04 2024

If you are running this notebook from the docker image, you can hit https://**host/demo1* with your web browser. In case you have direct access to port 8181 of this Jupyter host, you can hit http://**host**:8181/marco with your web browser.

But since this demo will likely be running on a virtual server hiding private ports, here's the code based on the basic HTTP client introduced earlier in the notebook to fetch the route and visualize the returned HTML right in a cell:

In [82]:
fetchDemo =
  let
    url = "GET http://localhost:8181/marco"
    request = N.setQueryString [("arg", Just "123")] $ N.parseRequest_ url
  in do
    reqMgr <- N.newManager N.defaultManagerSettings
    N.httpLbs request reqMgr

reply <- fetchDemo
ID.html . TL.unpack . TL.decodeUtf8 $ reply.responseBody
Haskell, Hidden Gem

Haskell, Hidden Gem

Name Destination
#1 ALPINE CUSTOMS #1 ALPINE CUSTOMS
1/OFF KUSTOMS, LLC 1/OFF KUSTOMS, LLC
102 IRONWORKS, INC. 102 IRONWORKS, INC.
12832429 CANADA INC. 12832429 CANADA INC.
17 CREEK ENTERPRISES 17 CREEK ENTERPRISES

Is the SVG from D3.js showing up in the second half of the HTML page? For me sometimes it does, sometimes it doesn't... Hopefully you can actually hit the route with your browser directly to get the correct result.

When you have are done with the request, run the next cell to kill the background process and keep the execution space clean.

In [83]:
Cc.killThread tid
:! date
Wed May 22 02:56:45 PM +04 2024
No description has been provided for this image

Reaching the main objective: a production grade modern web app

This is the conclusion of this notebook: a normal, modern, efficient web app built in the Haskell ecosystem, using HTMX and Tailwind in the frontend, server-side rendering of templates filled with data fetched by SQL queries in the backend and websocket low-latency communications between client and server. That plus authentication, security, distributed computing... the usual stuff.

You've already seen how to deal with JSON, how to render HTML from either server-side DOM (Blaze) or Jinja templates (Ginger), how to query a Postgresql database using embedded SQL, and how to do HTTP operations (client & handler).

Further more I hope you have reached this cell in a short amount of time.

We'll wrap-up everything covered so far into a basic modern web application by using the Servant package. Servant a mature web app server based on type combinators. Finally, here are the types! And this is not simple typing stuff... Servant uses types as programming logic rather than just for expressing data structures and constraints for functions. That will give you a glimpse of how powerful the type system of Haskell is compared to other programming languages and why it is so interesting as a programming language.

Servant type-based approach to endpoint definition provides verifiable, composable and extendable route specfication for web services. Beside the usual advantages of strongly-typed code, type-derived logic can automatically deduce route extension rules, apply security handlers, adapt request & reply behaviors based on middleware interactions, convert automatically data payloads from and to clients, generate client-side code and compliance testing code, and the list goes on.

If only OpenAPI was based on Haskell...

Once you've worked with Servant, you will find that the popular solutions of other ecosystems are simplistic, error-prone and inefficient. NextJS, Chi, Gin, Tokio, Rocket, Express, Flask, you name it, they just don't have the tools available in the Haskell ecosystem to compete.

In order to implement all the functionality in the notebook, we'll have to deal with a few limitations of the environment. For some reason, I can't get the local module import to work if the module hasn't been implemented within the notebook. For that reason the WebServer.CorsPolicy and WebServer.JWT module definnitions are in the first few cells of the notebook, which you need to be run for this part to work. In normal conditions you would simply import the two modules from files.

I put the language mode settings and the long list of imported packages as collapsed cells. You can have a look at the code if you want to, but it's really not necessary for getting in the main part of the implementation.

Then to insert markdown cells for documenting the code I also had to define every symbol (function, type) before it is referred. That forced me to lay out the code in the opposite order that I wouldd normally write it. Normally I would go top-down and write the main logic (listen) first then define more and more of the details referred to by the main logic. But here it's bottom-up order. Probably that some blocks of code will seem pretty abstract when they show up, and you'll have to wait until you read the code that uses them to better understand how the Servant system works.

In [84]:
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE DerivingStrategies #-}
{-# LANGUAGE DeriveAnyClass #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
In [85]:
import Control.Concurrent.Async (concurrently_)
import Control.Exception (bracket)
import Control.Monad (forever, void)
import Control.Monad.Catch (MonadCatch, MonadMask, MonadThrow)
import Control.Monad.Cont (runContT, ContT (..), Cont)
import Control.Monad.Except (ExceptT, MonadError, withExceptT)
import Control.Monad.IO.Class (MonadIO, liftIO)
import Control.Monad.Reader (ReaderT, MonadReader, runReaderT, asks)
import Control.Exception.Safe (tryAny)

import Data.List.NonEmpty (NonEmpty (..))
import qualified Data.Int as DI

import Data.Aeson (FromJSON (..), ToJSON (..), Value (Object), (.:), (.:?), eitherDecode)

import Network.Wai.Handler.Warp as Wr
import Network.Wai.Parse (setMaxRequestKeyLength, defaultParseRequestBodyOptions)
import Network.HTTP.Media ((//), (/:))
import Network.Wai.Middleware.RequestLogger (logStdout)
import Network.Wai.Middleware.Servant.Errors (errorMw)
import qualified Network.WebSockets as Ws

import GHC.Stack (HasCallStack)
import Servant as Srv
import Servant.API.Generic
import Servant.API.ContentTypes (FormUrlEncoded)
import Servant.Auth.Server (Auth, AuthResult (..), BasicAuth, BasicAuthCfg, CookieSettings (CookieSettings, cookieIsSecure)
                  , IsSecure (NotSecure), FromBasicAuthData, JWT, JWTSettings, FromJWT (..), ToJWT (..), cookieIsSecure
                  , defaultCookieSettings, defaultJWTSettings )
import qualified Servant.Auth.Server as Sauth
import Servant.Multipart (defaultMultipartOptions, MultipartOptions (..), Tmp)
import Servant.Server.Generic (AsServerT, genericServerT)
import Servant.API.WebSocket (WebSocket)
import Web.FormUrlEncoded (FromForm (..))

import System.Posix.Signals as Sgnl

import qualified Text.Blaze.Htmx as A
import qualified Text.Blaze.Html.Renderer.Utf8 as H
import qualified Text.Blaze.Htmx as X
import qualified Text.Blaze.Htmx.WebSockets as X

In a normal coding environment, the helper logic is brought into a the main definitions, like this (remember that the WebServer modules are defined in the first cells of the notebook):

In [86]:
-- Bring in the Cors Policy & JWT helpers to configure the HTTP handler.

import WebServer.CorsPolicy (setCorsPolicy)
import WebServer.JWT (generateKeyPairIO, readJWK)

import qualified Options.Runtime as Ropt

Forced to use the bottom-up code introduction, here are some data structures that will be used in the main code...

The AppEnv structure will represent the global state of the web service, and hold:

  • the JWT settings used most importantly to handle the encoding/decoding of JW tokens in HTTP requests,
  • the execution options that control the setup of the web server (in this example we'll only use a port value),
  • the global DB context to pass to handlers so they can do SQL operations.
In [87]:
data AppEnv = AppEnv {
    jwtSettings :: JWTSettings
  , rtOptions :: Ropt.RunOptions
  , dbCtxt :: Maybe DbC.Connection
  }

The HTML type is introduced as a placeholder, that is no actual structure is attached to it. We then use it to tag the result of an endpoint handler with a MIME type corresponding to text/html. It ensures that the requesting client will render the HTML instead of showing its text.

The RawHtml is a demo of how to pass results back up the responder chain; in this case we'll only use it to hold some flat content, ie a long string that is a HTML page that needs to be sent back to the client. But in real-life you could see a more complex kind of definition where the instance MimeRender HTML RawHtml logic extension adds all kind of post-processing functionality to produce the response that is actually sent back to the client.

In [88]:
data HTML = HTML
newtype RawHtml = RawHtml { rawContent :: LBS.ByteString }

instance Accept HTML where
  contentType _ = "text" // "html" /: ("charset", "utf-8")

instance MimeRender HTML RawHtml where
  mimeRender _ = rawContent

Note: the newtype keyword is used in this example to define RawHtml instead of the usual data keyword. A newtype still creates a type, but it's a simpler kind of structure that has less runtime overhead. For example, a newtype data structure can only contain one value in a record. Using data would also work but doing so gets the compiler to complain that a newtype is better...

To handle secure data exchange between the client and the server, we pass a JWT that holds securely the SessionContext structure. The conversion from/to JSON and the encryption/decryption are implemented by doing a derivation of the FromJSON, ToJSON, FromJWT and ToJWT typeclasses. That makes everything work automagically.

In [89]:
newtype SessionContext = SessionContext {
    sessionID :: DI.Int32
  }
  deriving stock Generic
  deriving anyclass (FromJSON, ToJSON, FromJWT, ToJWT)

For a client to initially authenticate itself, it will send a simple JSON block with a identity and trust values. These values are manipulated in the code with the LoginForm structure. Similarly the LoginResult structure is used when the authentication succeeds in sending a JSON block with a context and an encrypted JWT back to the client. Again the FromJSON and ToJSON derivations take core of the conversion process automagically.

In [90]:
data LoginForm = LoginForm {
  identity :: T.Text
  , trust :: T.Text
  }
  deriving stock (Eq, Show, Generic)
  deriving anyclass (ToJSON, FromJSON)

data LoginResult = LoginResult {
    context :: SessionContext
    , jwt :: T.Text
  }
  deriving stock Generic
  deriving anyclass (ToJSON)

The web app has an endpoint providing a search feature. The client will use a HTML form element to implement the capture and transmission of the search needle, and the SearchContent structure represents just that and it is derived with FromForm to trigger automatically the conversion from the HTTP request application/x-www-form-urlencoded data into a Haskell Text value.

In [91]:
newtype SearchContent = SearchContent {
    needle :: T.Text
  }
  deriving stock (Eq, Show, Generic)
  deriving anyclass (FromForm)

With these data structures out of the way, we get to the important part of the web server implementation: the route definitions. Using Servant combinators, the routes are presented as a set of structure definitions, the top-level one being ServerRoutes.

Servant routing specification is great for large-scale systems. To demo that point, we split the routing at the top-level into two branches: anonymous and authenticated. Nothing special happens for the routes that are part of the anonymous branch and the definition just continues at the AnonymousRoutes type. But for the routes defined in the authenticated branch, some standard pre-processing logic will occur before their logic is executed.

Ok here's the cool stuff. The line of code:

route :- Auth '[JWT, Sauth.BasicAuth] SessionContext :> ToServantApi AuthenticatedRoutes

is a type expression, not just a type definition. The exotic :-, '[ ] and :> symbols are actually type functions defined by Servant; remember that Haskell gives developer a lot of flexibility in function naming. These functions operate on types rather than values. For the heavy-duty functional folks, that's 2nd order logic.

The code states that a route construct is extended (:-) so that the request handling logic first figures out the authentication (Auth) using JWT and BasicAuth ('[JWT, Sauth.BasicAuth]) and carries that into the session information (SessionContext) before chaining on to the next level of route definitions provided by the AuthenticatedRoutes.

The end-result of Servant advanced type operations is that we can assemble routes in logical groups which can be named, we can cluster and extend route definitions, we can specify formally the data structures that will come in and out of the HTTP request processing and handling, and we can get automatic complex conversions between client and server-side data. As the web app grows, this provides significant time saving during refactoring, cleaning and testing.

The route definitions. Code first, description after:

In [92]:
data ServerRoutes route = ServerRoutes {
    anonymous :: route :- ToServantApi AnonymousRoutes
    , authenticated :: route :- Auth '[JWT, Sauth.BasicAuth] SessionContext :> ToServantApi AuthenticatedRoutes
  }
  deriving (Generic)


data AnonymousRoutes route = AnonymousRoutes { 
    login :: route :- "inlogin" :> ReqBody '[JSON] LoginForm :> Post '[JSON] LoginResult
    , staticLink :: route :- "ph_1/xstatic" :> CaptureAll "path" String :> Get '[HTML] RawHtml
    , stream :: route :- "stream" :> WebSocket
    , search :: route :- "xsearch" :> ReqBody '[FormUrlEncoded] SearchContent :> Post '[HTML] RawHtml
    , homePage :: route :- Get '[HTML] RawHtml
  }
  deriving (Generic)


newtype AuthenticatedRoutes route = AuthenticatedRoutes { 
    privateAccess :: route :- "private" :> ToServantApi PrivateRoutes
  }
  deriving (Generic)


newtype PrivateRoutes route = PrivateRoutes { 
    getPage :: route :- "static" :> Capture "path" String :> Get '[HTML] RawHtml
  }
  deriving (Generic)

The AnonymousRoutes specifies 5 routes. You can probably already guess what is being defined if you read the 5 type definitions, but here is the explanation:

  • the login route is at /inlogin (first part), it expects a payload of JSON from the client compabitle with the LoginForm, it comes in as a POST request and its handler will provide a LoginResult that will sent back as a JSON block. That's the route a client uses to authenticate itself.
  • the staticLink route is at /xstatic, it expects more segments in the URL and these will be globbed up as a list of strings (CaptureAll ...; the path identifier is just a placeholder), it comes as a GET request and the handler will return some long text that the client will consume as text/html content (GET '[HTML] RawHtml).
  • the stream route is at /stream, and it's going to operate as a websocket connection.
  • the search route is at /xsearch, it expects a urlencoded form in the URL in the form of SearchContent, it comes in as a POST request and the handler will return some long text that the client (it's starting to be repetitive).
  • the homePage route is /, it simply comes in as GET request and returns text/html, ie a the HTML for the home page.

The AuthenticatedRoutes is a grouping for handling requests to the /private path prefix. I wanted to showcase the convenient structuring and extension of route definitions in Servant, but I don't want to make the API too long, so there's only a very simple continuation in the group, defined by the PrivateRoutes type. That last definition, PrivatesRoutes, provides a getPage route. As it is an extension, the actual URL for that route will be /private/static. Like the xstatic route it is a globber, it comes in as a GET and the handler will return a HTML page.

In real life you'd expect to have many more route definitions in the authenticated group and probably many levels of definitions. For an introductory example, we have enough.

Now that routes are specified as types, the entire conformance and meta-programming power of Haskell is available to analyse the specifications and activate all kind of further processing. There's no extra description file required, no pre-parsing or conversion step, no boiler-plate code generation required.

The meta-programming features can also integrate popular 3rd party API specs in the Servant framework. For example if you have an API specified as a GraphQL file, the compiler can read the spec file directly (Morpheus) as the base for Servant route definitions. There are no intermediate file generated, no rewriting functions, etc; you can just focus on adding handler logic. There are similar solutions for OpenAPI files.

Specific error management for our handlers is defined by the DemoServiceError structure. When a request handler hits an issue, it throws an error specified in DemoServiceError. The Servant logic will do the rest to send a proper response to the client, using the asHttpError function to convert from our specific web app issues to standard HTTP errors.

In [93]:
data DemoServiceError
  = NotImplemented
  | UnexpectedError T.Text
  | NotAuthorized T.Text
  | Unaccessible
  | NotFound T.Text
  deriving stock (Generic, Show, Eq)
  deriving anyclass (ToJSON, FromJSON)


asHttpError :: DemoServiceError -> ServerError
asHttpError err =
  case err of
    Unaccessible -> err401 { errBody = "Resource not accessible." }
    NotImplemented -> err500 { errBody = "Not Implemented." }
    UnexpectedError x -> err500 { errBody = textToLBS x }
    NotAuthorized x -> err401 { errBody = textToLBS x }
    NotFound x -> err404 { errBody = textToLBS x }
  where
    textToLBS = LBS.fromStrict . T.encodeUtf8

Finally we define the WebApp structure to hold an execution state for route handlers invoked by Servant main logic. It's rather a technical detail, but importantly it connects the global data context AppEnv and the specific error management DemoServiceError with the request handlers execution. The WebApp type is automagically bound to the result of each route handler as per its specific route definition.

Note: the WebApp type is deriving lots of Monad stuff because the Monad meta-programming is Haskell way of dealing with real life processing and state management, ie changing context and non-pure execution.

In [94]:
newtype WebApp a = WebApp { 
    runApp :: ReaderT AppEnv (ExceptT DemoServiceError IO) a
  }
  deriving newtype (
    Functor, Applicative, Monad, MonadMask, MonadCatch, MonadThrow
    , MonadReader AppEnv, MonadIO, MonadError DemoServiceError
  )

We now get to the first route handler, loginHandler. It gets a LoginForm as a parameter and returns a LoginResult. Due to the bottom-up rather than top-down code presentation, the function will be linked to the /inlogin endpoint in code that will show up further down in the handler associations section.

As this is a demo without a real authentification system, the loginHandler simply pulls out JWT stuff with:

asks jwtSettings

which extracts the info from the jwtSettings field that is held in the AppEnv value (Haskell monad magic at work). If there's no problem doing so, it builds a new encrypted token with a fake SessionContext and puts that into a LoginResult value that is the result of the loginHandler call. If there's a problem, an UnexpectedError is thrown.

In a real life application, the authentication logic would be provided in the AppEnv value, and it would be used to assess the validity of the login data. Alternatively a 3rd-party authentication system could have already provided a JWT for the client and then that token would simply be verified by loginHandler, ie there would be no LoginForm required and instead the route definition would specify how the JWT is received (in headers, as a URL parameter, etc).

In [95]:
loginHandler :: LoginForm -> WebApp LoginResult
loginHandler form = do
  settings <- asks jwtSettings
  jwtRez <- liftIO $ Sauth.makeJWT (SessionContext 1) settings Nothing
  case jwtRez of
    Left err ->
      throwError . UnexpectedError . T.pack $ show err
    Right jwtValue ->
      pure $ LoginResult {
          context = SessionContext 1
          , jwt = T.decodeUtf8 . LBS.toStrict $ jwtValue
        }

The homePage handler loads up a Jinja template, puts some values in its render context from the DB, and sends that back as the result to the client. In a real life application there would be some caching and/or session pooling to use the DB resource more efficiently.

In [96]:
homePageHandler :: WebApp RawHtml
homePageHandler = do
  template <- liftIO $ Prelude.readFile "ph_1/xstatic/templates/templ_4.html"
  let
    genHtml = Gi.htmlSource $ Gi.runGinger (Gi.makeContextHtml derefLabelB) (loadTemplateB template)
  pure . RawHtml . LBS.fromStrict . T.encodeUtf8  $ genHtml

derefLabelB needle =
  Gi.toGVal $ Mp.lookup needle demoData_1

loadTemplateB aTemplate = do
  either (error . show) id . runIdentity $
    Gi.parseGinger (const $ return Nothing) Nothing aTemplate

The searchHandler is the endpoint logic that responds to a HTMX request, so it replies with HTML-formatted content in which values are fetched from the DB:

In [97]:
searchHandler :: SearchContent -> WebApp RawHtml
searchHandler searchContent = do
  mbDbCtxt <- asks dbCtxt
  dbVal <- case mbDbCtxt of
    Just conn -> do
      rezA <- liftIO $ Se.run (simpleQuery searchContent.needle) conn
      case rezA of
        Left err -> pure . Left $ show err
        Right mbVal -> pure $ Right mbVal
    Nothing -> throwError . UnexpectedError $ "No database available."
  let
    genHtml = Gi.htmlSource $ Gi.runGinger (Gi.makeContextHtml (derefGhProj dbVal)) (loadTemplate template_2)
  pure . RawHtml . LBS.fromStrict . T.encodeUtf8  $ genHtml

The anonStaticHandler simply loads up files from the xstatic/assets local Jupyter server directory and sends that data back to the client as a text/html reply.

In [98]:
anonStaticHandler :: [ String ] -> WebApp RawHtml
anonStaticHandler pageUrls = do
  let
    fullPath = Prelude.foldl (\accum segment -> accum <> "/" <> segment) "ph_1/xstatic" pageUrls
  -- liftIO deals with getting the right order of IO and WebApp monadic contexts:
  pageContent <- liftIO $ LBS.readFile fullPath
  pure . RawHtml $ pageContent

The prvStaticHandler does the same job as the anonStaticHandler, but as it's in the private group or routes, it gets authentication and session information as parameters (the AuthResult SessionContext part). It would then allow/block or modify content access based on that. For the sake of simplicity of this demo, it ignores that information and just returns some HTML.

In [99]:
prvStaticHandler :: AuthResult SessionContext -> String -> WebApp RawHtml
prvStaticHandler authResult pageUrl =
  case authResult of
    Authenticated context ->
      -- TODO: fetch an interesting piece of information in HTML.
      let
        content = H.renderHtml $ H.div $ H.toHtml $ T.pack . show $ context.sessionID
      in
        pure $ RawHtml content
    _ ->
      throwError . NotAuthorized . T.pack $ pageUrl

Finally we have the streamHandler handler. It receives requests from clients over a websocket connection. It replies on the same connection with HTML-formatted data, as it is intented to be invoked from HTMX-enabled elements:

In [100]:
streamHandler :: MonadIO m => Ws.Connection -> m ()
streamHandler conn = do
  liftIO $ Ws.withPingThread conn 30 (pure ()) $ do
    -- if you want to inspect streaming: liftIO $ Ws.sendTextData conn ("<div id=\"notifications\" hx-swap-oob=\"beforeend\">Some message</div?" :: ByteString)
    handleClient
  where
    handleClient = do
      rezA <- tryAny $ forever receiveStream
      case rezA of
        Left err -> do
          liftIO . putStrLn $ "@[streamHandler] situation: " <> show err
          closeConnection
        Right _ -> do
          liftIO $ putStrLn "@[streamHandler] client disconnected."
          pure ()

    receiveStream = do
      rezA <- Ws.receiveDataMessage conn
      case rezA of
        Ws.Text msg decodedMsg ->
          let
            hxMsg = eitherDecode msg :: Either String HxWsMessage
          in
          case hxMsg of
            Left err -> do
              putStrLn $ "@[receiveStream] invalid HxWsMessage: " <> (T.unpack . T.decodeUtf8 . LBS.toStrict) msg
              putStrLn $ "@[receiveStream] error: " <> show err
            Right hxMsg ->
              Ws.sendTextData conn $ H.renderHtml $ htmxReply hxMsg.wsMessage
        Ws.Binary msg ->
          putStrLn "@[receiveStream] received binary."
    
    closeConnection = do
      Ws.sendClose conn ("Bye" :: ByteString)
      void $ Ws.receiveDataMessage conn

    htmxReply aMessage =
      H.tbody H.! A.id "notifications" H.! X.hxSwapOob "beforeend" $ do
        H.tr $ do
          H.td H.! A.class_ "px-6 py-4 whitespace-nowrap text-sm text-slate-900" $ H.toHtml aMessage
                

data HxWsHeaders = HxWsHeaders {
    request :: T.Text
    , trigger :: T.Text
    , triggerName :: Maybe T.Text
    , target :: T.Text
    , currentURL :: T.Text
  }
  deriving stock (Show, Generic)

instance FromJSON HxWsHeaders where
  parseJSON (Object obj) = HxWsHeaders <$>
    obj .: "HX-Request"
    <*> obj .: "HX-Trigger"
    <*> obj .:? "HX-Trigger-Name"
    <*> obj .: "HX-Target"
    <*> obj .: "HX-Current-URL"


data HxWsMessage = HxWsMessage {
    wsMessage :: T.Text
    , headers :: HxWsHeaders
  }
  deriving (Show, Generic)


instance FromJSON HxWsMessage where
  parseJSON (Object obj) = HxWsMessage <$>
    obj .: "ws-message"
    <*> obj .: "HEADERS"

The hxWsMessage and HxWsHeaders structures receive the data sent by the HTMX logic on the client. They extend the FromJSON typeclass logic to match the format used by HTMX to send the client's data.

Note: the flexibility of JSON conversion provided by the FromJSON typeclass results in more code than for example the Golang struct tags ('json:...'). In this notebook it's worthy to expose the conversion mechanisms for learning purposes. But of course there's a package out there that provides a camel-case formating function to achieve the same pre-defined field name encoding as in the Golang tags.

We're almost done! After having defined the routes and handlers, we connect them together in the next block of code. First we specify the functions that use Servant's genericServerT combinator, second we define a value for each of the route types we've created a few cells ago, and third we associate each handler function that implement an endpoint.

The ultimate glue point for all these specifications to Servant general request handling logic is the weird looking

serverApiProxy :: Proxy (ToServantApi ServerRoutes)
serverApiProxy = Proxy

It's a typing trick! The important part is the type definition Proxy (ToServantApi ServerRoutes) for the function serverApiProxy. That will be analyzed by the compiler to figure out all kind of futher logic expansion using the definitions of ServerRoutes. The function implementation is just providing a generic value of that Proxy ... type, using the constructor also named Proxy. The funciton itself has no real use.

In [101]:
-- Handler associations:
serverApiProxy :: Proxy (ToServantApi ServerRoutes)
serverApiProxy = Proxy


serverApiT :: ToServant ServerRoutes (AsServerT WebApp)
serverApiT =
  genericServerT $ ServerRoutes {
    anonymous = anonHandlers
    , authenticated = authHandlers
  }


anonHandlers :: ToServant AnonymousRoutes (AsServerT WebApp)
anonHandlers =
  genericServerT $ AnonymousRoutes {
    login = loginHandler
    , staticLink = anonStaticHandler
    , stream = streamHandler
    , homePage = homePageHandler
    , search = searchHandler
  }


authHandlers :: AuthResult SessionContext -> ToServant AuthenticatedRoutes (AsServerT WebApp)
authHandlers authResult =
  genericServerT $ AuthenticatedRoutes {
    privateAccess = prvStaticHandler authResult
  }

As the web service uses form/url-encoded authentication, there's a logic extension provided for our SessionContext structure that could do some validation and book-keeping. But in this demo it will just defer to the default validation function authCheckFun.

In [102]:
type instance BasicAuthCfg = Srv.BasicAuthData -> IO (AuthResult SessionContext)
instance FromBasicAuthData SessionContext where
  fromBasicAuthData authData authCheckFun = authCheckFun authData

We have reached the operation side of running a web service! First, the WAI control is defined, where the listen port, the pre-processing logic (showBanner!) and the process interruption management are defined (shutdownHandler).

In [103]:
setupWai :: Int -> IO () -> Settings
setupWai port shutdownCallback =
  Wr.setPort port . Wr.setGracefulShutdownTimeout (Just 5) . Wr.setInstallShutdownHandler shutdownHandler
    . setBeforeMainLoop showBanner
    $ Wr.defaultSettings
  where
    showBanner =
      putStrLn $ "@[setupWai] using port: " <> show port
    shutdownHandler closeSocket = do
      void $ installHandler Sgnl.sigTERM (Catch $ shutdownCallback >> closeSocket) Nothing
      void $ installHandler Sgnl.sigINT (Catch $ shutdownCallback >> closeSocket) Nothing
      void $ installHandler Sgnl.sigQUIT (Catch $ shutdownCallback >> closeSocket) Nothing
      void $ installHandler Sgnl.sigHUP (Catch $ shutdownCallback >> closeSocket) Nothing

We now have reached the definition of the overall bootstrap logic for the servicing, the runAPI function. It takes care of setting up the JWT encryption components, the DB connection, it puts the middlewares in place (logging, multi-part requests handler, CORS policy handlers), creates the global context appEnv that each handler will work with, builds the main server loop (hostServerWithContext ...) and finally returns everything linked together.

In [104]:
runAPI ::  Ropt.RunOptions -> IO Application
runAPI rtOpts = do
  -- Initialize the JWT parameters:
  myKey <- case rtOpts.jwkConfFile of
    Nothing ->
      generateKeyPairIO "/tmp/jwk.json"
    Just aPath ->
      readJWK aPath

  -- Initialize the DB parameters:
  eiConn <- DbC.acquire $ DbC.settings "aws-0-eu-central-1.pooler.supabase.com" 5432 "demoa.vtjcwbccdutdqcsxdaqr" "justForDemo" "demoha"
  case eiConn of
    Left err ->
      putStrLn $ "@[runAPI] db connection err: " <> show err
    Right _ -> pure ()

  -- pure computations:
  let
    cookieCfg = defaultCookieSettings { cookieIsSecure = NotSecure }
    jwtDefSettings  = Sauth.defaultJWTSettings myKey
    -- For file upload support, will be used in next demo:
    multipartOpts = (defaultMultipartOptions (Proxy :: Proxy Tmp)) { 
          generalOptions = setMaxRequestKeyLength 512 defaultParseRequestBodyOptions
      }

    -- Define the overall Servant pre-processing required on requests:
    runContext = cookieCfg :. jwtDefSettings :. fakeSessionValidation :. multipartOpts :. EmptyContext
    runCtxtProxy = Proxy :: Proxy '[CookieSettings, JWTSettings, BasicAuthCfg]

    -- activate some middlewares for loggin, CORS, standardized error handling:
    middlewares = linkUp $ id :| case rtOpts.corsPolicy of
      Nothing -> [ logStdout, errorMw @JSON @'["message", "status" ] ]
      Just aPolicy -> [ logStdout, setCorsPolicy aPolicy, errorMw @JSON @'["message", "status" ] ]

    mbDbConn = case eiConn of Left _ -> Nothing ; Right aConn -> Just aConn
    -- define the global runtime context for handlers:
    appEnv = AppEnv { jwtSettings = jwtDefSettings, rtOptions = rtOpts, dbCtxt = mbDbConn }
    -- link all that stuff in Servant magic:
    server = hoistServerWithContext serverApiProxy runCtxtProxy (toHandler appEnv) serverApiT

  -- launch the servicing:
  pure $ middlewares $ serveWithContext serverApiProxy runContext server
  where
    linkUp :: NonEmpty (a -> a) -> a -> a
    linkUp = Prelude.foldr1 (.)

    -- | Boilerplate transformation between our service structures and Servant's 'Handler' monad:
    toHandler :: AppEnv -> WebApp a -> Srv.Handler a
    toHandler e =
      Handler . withExceptT asHttpError . flip runReaderT e . runApp

    fakeSessionValidation :: BasicAuthData -> IO (AuthResult SessionContext)
    fakeSessionValidation _ =
      pure $ Authenticated $ SessionContext 1

The main function for executing the service is listen. It's using the Continuations package, which is Haskell ecosystem main formalization for a high-level manager of complex processing environments that include multiple processes, interruptions, initialization and cleanup requirements, and etc. The function is most importantly a call to runContT which in turn will invoke:

webHandling <- runAPI rtOpts

that sets up all the servicing logic for execution, and then

runSettings settings webHandling

that will create the WAI HTTP listener and starts Servant main request processing logic.

Additionally it defines what happens before, after, how to gracefully shutdown the process, etc, which in this demo is all mock logic.

In [105]:
listen :: Ropt.RunOptions -> IO ()
listen rtOpts = do
  let
    fakeContT = ContT $ bracket (fakeContFct "dummy.") fakeEndFct
  runContT fakeContT finalAction
  where
  finalAction dummy = do
    let shutdownCallback = putStrLn "@[finalAction] empty termination callback."
        settings = setupWai rtOpts.webServer.port shutdownCallback
    webHandling <- runAPI rtOpts
    Wr.runSettings settings webHandling

  fakeContFct :: [a] -> IO Int
  fakeContFct l = return (Prelude.length l)

  fakeEndFct :: Int -> IO ()
  fakeEndFct aNum = pure ()

Like for the WAI/Warp example before, running the web service is done in a sub-thread with a call to the forkIO function. The next lines spawn a full blown web server that is a realistic template for advanced and modern web app implementation:

In [106]:
defaultRun :: RunOptions
defaultRun =
  RunOptions {
    debug = 0
    , webServer = WebServerOptions {
        port = 8180
        , host = "localhost"
      }
    , jwkConfFile = Nothing
    , corsPolicy = Nothing
  }
In [107]:
tid <- Cc.forkIO (listen defaultRun)
:! date
Wed May 22 02:56:58 PM +04 2024
In [218]:
Cc.killThread tid
:! date
Wed May 22 01:49:23 PM +04 2024

STOP HERE FOR NOW.

Let's use the route definition defined in the previous example to generate automatically a client for the API:

In [ ]:
import Servant.Client
import Servant.Client.Core.Request

instance HasClient ClientM WebSocket where
  type Client m (Stream method status framing ct (Headers hs a)) = m (Headers hs a)

  hoistClientMonad _ _ f ma = f ma

  clientWithRoute _pm Proxy req = withStreamingRequest req' $ \gres -> do
      let mimeUnrender'    = mimeUnrender (Proxy :: Proxy ct) :: LBS.ByteString -> Either String chunk
          framingUnrender' = framingUnrender (Proxy :: Proxy framing) mimeUnrender'
      val <- fromSourceIO $ framingUnrender' $ responseBody gres
      return $ Headers
        { getResponse = val
        , getHeadersHList = buildHeadersTo . toList $ responseHeaders gres
        }

    where
      req' = req
          { requestAccept = fromList [contentType (Proxy :: Proxy ct)]
          , requestMethod = reflectMethod (Proxy :: Proxy method)
          }

loginC :<|> staticC :<|> streamC :<|> homeC :<|> searchC :<|> privStaticC = client serverApiProxy
Eta reduce
Found:
hoistClientMonad _ _ f ma = f ma
Why Not:
hoistClientMonad _ _ f = f
<interactive>:2:3: error:
    • Type indexes must match class instance head
      Expected: Client ClientM WebSocket
        Actual: Client m (Stream method status framing ct (Headers hs a))
    • In the type instance declaration for ‘Client’
      In the instance declaration for ‘HasClient ClientM WebSocket’

¶

References

  • GitHub Top 100
  • Youtube Embedding
  • IHaskell Kernel for Jupyter