Reference

Unique Phrases

Tabernacles of Clay

Book of Mormon

Mosiah 3:5:

... the Lord Omnipotent ... shall come down from heaven among the children of men, and shall dwell in a tabernacle of clay

Moroni 9:6:

... we have a labor to perform whilst in this tabernacle of clay, that we may conquer the enemy of all righteousness, and rest our souls in the kingdom of God.

The seemingly uniquely Mormon phrase “tabernacles of clay” (above) is a strange beast. The three-word phrase is not found in the King James Version (KJV) of the Bible. It combines two ideas. The first is the Old Testament “tabernacle” (the portable sanctuary, predecessor to the temple of Solomon, which the children of Israel carried with them in the wilderness to worship the Lord), and especially Paul's reference to the mortal body as “our earthly house of this tabernacle” in 2 Cor. 5:1. The second is Old and New Testament phraseology about people being, or dwelling in, earthen vessels, jars of clay, potter's vessels, etc. Together they express metaphorically that the human spirit dwells in a temporary, flawed, and breakable mortal house of clay, especially in comparison to the Maker, the Potter, i.e. God. (This “tabernacle = temple” concept also reminds us of the phrase “the body is a temple.”)

2 Cor. 5:1:

For we know that if our earthly house of this tabernacle were dissolved, we have a building of God, an house not made with hands, eternal in the heavens.

Old Testament

Jeremiah 18:4:

And the vessel that he made of clay was marred in the hand of the potter: so he made it again another vessel, as seemed good to the potter to make it.

Isa. 45:9:

Woe unto him that striveth with his Maker! Let the potsherd strive with the potsherds of the earth. Shall the clay say to him that fashioneth it (other translations, say to the potter), What makest thou? or thy work (say to the potter, e.g. God), He hath no hands?

Job 4:19:

How much less in them that dwell in houses of clay, whose foundation is in the dust, which are crushed before the moth?

Job 13:12:

Your remembrances are like unto ashes, your bodies to bodies of clay.

New Testament

Romans 9:20–21 (reference Isa. above):

Nay but, O man, who art thou that repliest against God? Shall the thing formed say to him that formed it, Why hast thou made me thus? Hath not the potter power over the clay, of the same lump to make one vessel unto honour, and another unto dishonour?

2 Corinthians 4:7–11:

But we have this treasure in earthen vessels (also alternately translated jars of clay), ... Always bearing about in the body the dying of the Lord Jesus, that the life also of Jesus might be made manifest in our body. For we which live are alway delivered unto death for Jesus' sake, that the life also of Jesus might be made manifest in our mortal flesh.

2 Cor. 5:1 (again):

For we know that if our earthly house of this tabernacle were dissolved, we have a building of God, an house not made with hands, eternal in the heavens.

1 Thess. 4:4:

That every one of you should know how to possess his vessel (translated body in NIV, ESV, etc.) in sanctification and honour.

2 Timothy 2:20:

But in a great house there are not only vessels of gold and of silver, but also of wood and of earth; and some to honour, and some to dishonour.

Rev. 2:27:

And he shall rule them with a rod of iron; as the vessels of a potter shall they be broken to shivers (dashed to pieces like a potter's vessels, NKJV).

18th- & 19th-Century Preachers

The exact phrase, rendered into English in those three words, can be found in various sermons from the late 18th and early 19th centuries, showing that the phrase was in the religious zeitgeist at and around the time of the publication of the Book of Mormon:

The Critical Review: Or, Annals of Literature, Tobias Smollett, 1763, p.427:

Relate their peculiar gifts and graces, while dwelling in tabernacles of clay, they passed their mortal pilgrimage, in sacred love and pious ardour; imitating the bright example of their Saviour.

The Whole Works of the Late Rev. Mr. Ebenezer Erskine, 1798, p. 347:

Sirs, think upon it now; no sooner hath death dislodged you from these tabernacles of clay, and the eyes of your bodies shut, but that very moment you will find your souls sifted (“fifted” using obsolete typeset long s) before the awful tribunal of an infinite God, in order to have your eternal state determined.

The Whole Works of the Rev. William Bates, Vol 2, 1815, p. 297:

It is promised, “that God will dwell in us, and walk in us;” whose gracious presence is heaven upon earth. Strange condescension! that the God of glory should dwell in tabernacles of clay; far greater than if a king should dwell in a cottage with one of his poor subjects.

Joseph Hall, 1837, p.78, referencing 2 Cor. 5:1:

Let it not overgrieve us, to leave these tabernacles of stone, since we must shortly lay down these tabernacles of clay, and enter into tabernacles not made with hands, eternal in the heavens. Till then, farewell, my Dear Brethren, farewell in the Lord.

This should suffice to begin a reasonable discussion on the origin of this unique phrase in Joseph Smith's earliest scriptural production, in Mosiah 3:5 and Moroni 9:6, mentioned at the top.


Programming

More from Less

Too much has been written already about AI Art and how it is killing puppies.

But even people like me who enjoy playing with generative AI tools still don't like finding AI-art-based images mixed in with images made by humans the old-fashioned ways—by hand (analog art or digital painting, or pixel art), by camera, or by CGI. When I'm looking at photos, I don't want to see some random fake non-photo mixed in there. When I'm looking at art made by humans, I want to see art made by humans. Perhaps just share AI art in places where AI art is shared, with a prominent label. If we don't cross the streams, perhaps fewer people will get cranky.

Given that starting place, this article documents the tail-end of a long multi-year process of human-machine hybrid image creation. (I won't call it art because it makes the anti-AI knee-jerk brigade come out of the woodwork.) But assuming the reader is not some sort of Luddite, he may follow along.

And this article is not about the images themselves, per se, or whether this is good or bad for artists or for society as a whole, or even about AI itself. It's about what one programmer figured out how to do with certain tools available today. This is actually a programming journey.

What Are We Even Talking About?

My goal has gradually crystallized into this: overcoming the resolution limits of image generation tools and creating appealing super-high-resolution poster-sized images, up to 8k or in the 30 Megapixel to 60 Megapixel range (example 1, 2), with an increasing level of detail in places that make sense.

An example image

For the purposes of this article, assume the subject is portraiture/figurative and the max level of detail on the image is at the face. The background would naturally have less detail, as in an oil painting or a photograph with a depth of field effect. This artfully tells the viewer where the gaze should return to. This requirement of actually having detail when zoomed way in (for at least some part of the image) precludes simply depending on Topaz Gigapixel AI to turn an image of 1.5 Megapixels (or as low as 0.25 Megapixels) into anything nearly big enough. Gigapixel upscaling can achieve a lot, but without the newer generative features, it can't achieve miracles. But my process can.

Original, Painfully Slow, Too-Manual Process

My initial process involved using Wombo AI art online (from an iPad) to generate an image, then cropping the image repeatedly and doing Image-to-Image to create successively more detailed areas of the image (two nested crops of the original image, the third image generated from a crop of the second). The sad part was trying to take these assets, all the same size, say three images 1080 wide by 1920 tall, and combine them into a single image without upscaling artifacts. Naively upscaling using Bilinear, Bicubic, or even Lanczos will result in "screen-door" artifacts on the "zoomed out" parts (less detailed parts) of the image. (This also creates problems for the final "free" 2x upscale in Gigapixel AI: again, no miracles are possible once these big jaggies are introduced.)

Necessary Evil: Manual Scaling & Alignment on my Mac

So I began using Topaz Gigapixel AI and a lot of manual calculation and nudging and scaling to get the three images (for this one very simple example) into a shared pixel space without artifacts. It can be easily done but requires this very manual, very error-prone, very boring, very mechanical step before the images can even be masked to make a single image. But Gigapixel charges $2,000 for API access (on my own machine, offline!) so I cannot automate this painful step. (... or can I? ...)

The Fun Part: Masking and Manual Painting on iPad Affinity Photo

For me, the fun part is to take the PSD file (or in my case, Affinity Photo layers document) from my Mac (the only place where Gigapixel runs) back to my iPad (where I generated the images using an online image service such as Wombo AI Art) so I can manually mask the layers together, fix problems with actual manual digital painting, and apply final adjustments such as levels, white balance, manual local dodging and burning, etc.

Enter Mochi Diffusion

In 2024, less than a year ago, I got my hands on an Apple Silicon Mac, in this case the oldest 2020 M1 MacBook Air, the cheapest Mac laptop you can buy used, a device with no fan and 20-hour battery life. Apple has shipped Neural Engine cores in mobile devices such as iPhone and iPad for many years now, and getting this technology on a laptop opened up some really amazing avenues. Apple ported Stable Diffusion to run on CoreML, meaning it can use GPU cores or lower-power Neural Engine cores to run SD at an almost respectable speed on commodity mobile hardware. What I mean is that you don't need a $2,000 Nvidia GPU inside a massive box with huge loud fans and a 1,000 W power supply; you can play with this technology on a device with no fan, running on battery power.

Some amazing enthusiasts have pushed this forward, in terms of native Mac UI, in the form of an open-source project called Mochi Diffusion. Getting the code for this and compiling it allows a tinkerer like the author to add many crazy new features, to begin glimpsing what people have been doing for years on dedicated kilowatt hardware using A1111.

The first feature I added was the ability to favorite images as Mac-native Finder color tags. This got accepted into Mochi Diffusion and is shipping right now.

The Latest Process

My latest process is quite convoluted, but produces the highest quality and quantity of amazing results I can muster. Think controlled chaos, with a certain amount of manual control, leaving room for wonderful surprises.

Batching Prompts & Madlib-style Canned Faces

The first feature required to get piles of useful starting places is the ability to batch run dozens or hundreds of prompts from a text file. (This was simple enough to add to my experimental branch of Mochi Diffusion and is obvious. A1111 has had this for a while now too, obviously.)

In order to get good results for portraiture/figurative images, it helps to have a pile of "canned faces" filtered from an even bigger pile of "blended faces." I won't go into detail, but imagine a simple madlib Python script with a long list of pieces of text prompts which, when combined, generate stable, deterministic results that are unique and repeatable. The resulting faces vary widely, but a good madlib generator might produce perhaps 25% keepers and 75% ugly faces. Generate a giant text file, run this overnight using batching, then comb through the results and favorite the best ones using the feature I added to Mochi Diffusion. Since the color tag is stored in the file's extended attributes, you can just copy or move PNG files around in various folders (on Apple OSes) and the tag travels with them. So your folder of copies of the best PNGs is actually also a list of the best "canned face" prompts, for the next step. (You also need a command-line prompt-puller Shell/Python script that can use the open source Exiv2 library to turn a folder of PNGs back into a text file.)
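
A minimal sketch of the madlib idea, not the actual script: the fragment lists, the template wording, and the function name build_face_prompts are all invented for illustration. The point it shows is that a seeded generator over a fixed set of fragments gives prompts that are unique within a run and identical across runs.

```python
import itertools
import random

# Hypothetical prompt fragments; a real list would be much longer.
AGES = ["young", "middle-aged", "weathered"]
FEATURES = ["freckled", "sharp-jawed", "round-faced"]
HAIR = ["short black hair", "long red hair", "curly gray hair"]


def build_face_prompts(count, seed=0):
    """Return up to `count` distinct face prompts, identical for the same seed."""
    combos = list(itertools.product(AGES, FEATURES, HAIR))
    rng = random.Random(seed)  # private generator, so results are repeatable
    rng.shuffle(combos)
    return [
        f"portrait of a {age} {feature} woman with {hair}"
        for age, feature, hair in combos[:count]
    ]


if __name__ == "__main__":
    # One prompt per line, ready for a batch run from a text file.
    print("\n".join(build_face_prompts(5, seed=42)))
```

Because every combination appears at most once, asking for more prompts than there are combinations simply returns all of them (27 with the lists above).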

Rows and Columns

The next trick is to use the text file of canned high quality faces prompts as rows in an even more complicated x-y batching mode (yet another feature I added to my experimental development branch). The columns are more complicated, but they involve a folder full of collected-up images with an index text file of prompt templates and strength values for those images (with a wildcard to leave a spot for the row prompt “face” in there). X-Y batching begins by shuffling all rows and runs over N random columns per row, where N is set in the UI as the Number of Images field. Columns can be thought of as poses or compositions or outfits, and the inner prompt "canned face" gets spliced into the larger column's prompt template. This batch can be run for hours or overnight to get hundreds of low-res starter images which can be curated down to dozens of great starting point images, which can be upscaled with my complicated process, below.
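
The row/column scheme described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in the experimental branch: the "{face}" wildcard syntax, the (image, template, strength) tuple layout, and the choice to pick distinct columns per row are all guesses made for the example.

```python
import random


def xy_batch(face_prompts, column_templates, n_per_row, seed=0, wildcard="{face}"):
    """Build (prompt, image, strength) jobs: shuffled rows, N random columns each."""
    rng = random.Random(seed)
    rows = list(face_prompts)
    rng.shuffle(rows)  # shuffle all rows first
    jobs = []
    for face in rows:
        # Assumption: the N columns chosen for one row are distinct.
        picks = rng.sample(column_templates, min(n_per_row, len(column_templates)))
        for image_path, template, strength in picks:
            # Splice the canned face into the column's prompt template.
            jobs.append((template.replace(wildcard, face), image_path, strength))
    return jobs


if __name__ == "__main__":
    faces = ["young freckled woman", "weathered sharp-jawed man"]
    # Hypothetical index entries: (starting image, prompt template, strength).
    columns = [
        ("poses/seated.png", "oil painting of a {face}, seated by a window", 0.6),
        ("poses/standing.png", "oil painting of a {face}, standing in a garden", 0.5),
        ("poses/profile.png", "oil painting of a {face}, in profile", 0.7),
    ]
    for job in xy_batch(faces, columns, n_per_row=2):
        print(job)
```

Each job carries everything an Img2Img run needs, so the list can be written to a queue and left to run overnight.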

Columns & Cross-painting

Creating the images in the columns folder and related prompts is its own art form, and requires a lot of patience and experimentation. But this is also a huge part of how to get top notch results, building on years of collecting up awesome intermediate images, usually improved by cross-painting (manual inpainting of several nearly identical low-res images into a single optimized image) and correcting all of this messy output. Then the process can create endless variations with different faces using these poses and compositions as starting places.

Project Mode

I added an easy-to-use (but complicated-under-the-hood) Project pane to Mochi Diffusion that can help automate my fiddly upscaling process without relying on Gigapixel AI. Instead it uses the wonderful open source RealESRGAN upscaler, which is "good enough" to glue the whole process together, especially since it can all be automated.

Project Mode begins with a single low-res 512x512 PNG generated on my limited machine using just Neural Engine cores, using SD1.5 models. A single keyboard shortcut copies the prompt from the image to the UI, then the remaining parameters can be set by the user (notably Strength, which means Text Strength). The magic values that seem to work for this part of the process range from 0.25 to 0.3, but other values may work, depending on the SD CoreML model file you are using. (0.25 means 75% like the image, so details are added, but mostly this is like a generative upscale at every step.)


The user selects an image in Mochi Diffusion and enters Project Mode, which opens the starter PNG file (captioned "1.", a full, zoomed out composition) and creates a new unique folder where a pile of 512x512 images will be written to disk. First my code uses Apple's built-in Computer Vision tools for face and pose detection, along with tons of heuristics, (captioned "2.") to create a tree of complicated nested rectangles with explicit coordinates in the starter/original image space, all in a split second. Before doing the automated complicated process (below), the user gets to see how small the smallest rectangle is (face usually), relative to the starter image, with a scale number. For a typical figure this could be anywhere from 6x to 12x, but generally around 8x or 9x. This multiplier will set the scale for the final layered set of images and is very important in making the whole thing work without artifacts.
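
To make the scale number concrete, here is a small sketch. It assumes square crops and defines the multiplier as the starter image side divided by the longest side of the smallest rectangle; the Rect type, the function name, and the sample coordinates are hypothetical, and the real heuristics that produce the rectangles are not shown.

```python
from dataclasses import dataclass


@dataclass
class Rect:
    """A crop rectangle in starter-image pixel coordinates."""
    x: float
    y: float
    width: float
    height: float


def scale_multiplier(starter_size, rects):
    """Ratio of the starter image side to the longest side of the smallest crop."""
    smallest = min(rects, key=lambda r: max(r.width, r.height))
    return starter_size / max(smallest.width, smallest.height)


if __name__ == "__main__":
    # Hypothetical nested crops: full figure, torso, head, face.
    crops = [
        Rect(0, 0, 512, 512),
        Rect(96, 32, 320, 320),
        Rect(176, 48, 160, 160),
        Rect(224, 72, 64, 64),
    ]
    print(scale_multiplier(512, crops))  # 512 / 64 = 8.0
```

With a 64-pixel face in a 512-pixel starter image the multiplier is 8x, which sits inside the 6x to 12x range mentioned above.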

Project Mode has an Enqueue button so batching can be used yet again (for the third time?) after perhaps dozens of great starter images have been found. Enlarging a low-res image from 0.25 Megapixels into perhaps 20 Megapixels (so 80x for just the upscaled input image, and hundreds of times the number of total pixels, counting all the layers) takes about five minutes per project, so batching allows everything to run in succession in a fully automated fashion, perhaps overnight.

PSD for the Win

What is the output of each Project? Each project folder fills up with 512x512 images, which the Project collects location information about so it can create a giant layered PSD file at the end. Each image is first Crop/scaled from a 512x512 image before running SD Img2Img Generate on it. ("Crop/scale" is just "crop some part but scale back to 512x512" before Img2Img.) In this manner, final images will have decent quality for their size, but cannot be combined into a coherent whole without first being intelligently upscaled into the same pixel space (using RealESRGAN and not bilinear or bicubic or whatever). The final step just combines everything in the following manner: first, scale the widest image to the max scale (say, 8.5x), then scale subsequent images in the stack less and less, down to the final tiny face images, which are scaled by something like 1.25x and 1x. These images are added to a PSD document and written out to disk as a giant file, perhaps 100 MB to 500 MB depending on the full resolution. (Affinity Photo documents losslessly compress this to about 3x or 4x smaller, in practice, and PSD files themselves can be compressed more than my PSDWriter library supports.)
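
The layer arithmetic in the paragraph above can be worked through in a few lines. This is a sketch under assumptions (square crops given as (x, y, side) in starter coordinates, every crop regenerated as a 512x512 tile, hypothetical function and key names), not the Project code itself. It shows why the widest layer gets the maximum scale and the smallest face crop ends up at 1x.

```python
def layer_layout(crops, starter_size=512, tile_size=512):
    """For each crop (x, y, side) in starter coordinates, return its upscale
    factor plus its position and size on the final canvas."""
    smallest_side = min(side for _, _, side in crops)
    max_scale = starter_size / smallest_side
    layers = []
    # Widest crop first, so it becomes the bottom layer of the stack.
    for x, y, side in sorted(crops, key=lambda c: c[2], reverse=True):
        final_side = side * max_scale  # pixels this layer covers on the canvas
        layers.append({
            "upscale": final_side / tile_size,  # factor applied to the 512x512 tile
            "left": round(x * max_scale),
            "top": round(y * max_scale),
            "side": round(final_side),
        })
    return max_scale, layers


if __name__ == "__main__":
    # Hypothetical nested crops: full figure, torso, head, face.
    crops = [(0, 0, 512), (96, 32, 320), (176, 48, 160), (224, 72, 64)]
    max_scale, layers = layer_layout(crops)
    print(max_scale)  # 8.0
    for layer in layers:
        print(layer)
```

For these sample crops the upscale factors come out as 8.0, 5.0, 2.5, and 1.0 on a 4096x4096 canvas, so the face tile is used at its native 512x512 resolution while the full composition is enlarged the most.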

An example image

Here is an example of the automated PSD output of the process (captioned "3.") for one original low-res starter image, 512x512. Again this process is dependent on the fact that the "canned face" is known to look good as we zoom in. The prompt already has this information and can control the face. It's like cheating and knowing the future. The whole process is very dependent on this part of the prompt, to get controlled results, instead of requiring luck for those details. It's kind of like rendering a fractal at more and more zoomed in scales, where the formula is deterministic and simple but more detail is revealed with further computation. Similarly, we have mined the SD model for each fully structured 3-D face model "formula" that can be represented with a few hundred bytes of text, and given enough computation, can be revealed at any scale, lighting, pose, etc.


The PSD file for this is available as a ZIP here, 150 MB unzipping to 250 MB, if you want to see what the layers look like, and try turning them on and off, and imagine masking all this mess together into one beautiful image. Another reason this process works is that there is so much redundant information that the human manual masker (see below) has enough valid data to reject some of the often bizarre results that will inevitably creep into the PSD file. Those layers can simply be deleted.

I am not opposed to upscaling a finished, adjusted image one final time, manually, using Topaz Gigapixel AI, to get the final 8k image (7680x7680), available here, 59 Mpix @ 8 MB. I think these kinds of results are stunning, and I can create several fully realized images in a day once I have the PSD files in hand, which again, is pretty easy, given all this batching and curating that goes into getting a handful of great starting point images.

Major Caveat: Manual Masking Required

This whole process is heavily dependent on manual human masking of the PSD layers at the end, as many as one or two dozen layers. The results of masking for this example are captioned "4." above, before final adjustments. I've tried to automate this using Inpainting ControlNets, but the version in Mochi Diffusion is unusably slow and produces very poor quality results. But I find this masking a mostly pleasant process, like knitting or whatever, which is very low-key and amenable to music or podcast listening. Take the highly portable iPad anywhere with decent seating, sit down, lean back with the iPad in one hand and Apple Pencil in the other, and mask away. Also, some manual digital painting may be required, and some adjustments at the end can help put things over the finish line (captioned "5." above). Another example of running through this entire process is shown here:

An example image

Another caveat: this requires an SD CoreML model file that matches your style / prompt requirements (oil painting style or pastel painting style is particularly difficult to achieve). This adds another layer of complication with token count limits, in my case 75 positive and 75 negative tokens! In fact this example uses three different SD CoreML model files, which are automatically switched at different crop levels, developed heuristically for my needs.

Recap & Conclusion


I can create very-high-resolution results on very-low-end hardware. This even runs on battery, with no fan. I can overcome significant technical and creative barriers using human ingenuity and patience, instead of relying on luck or beefy hardware. It's an interesting challenge to create a bespoke process for one narrow type of image generation.

The code for this is in my Mochi Diffusion more-customizations branch but is very unsupported, and very experimental. An exercise for the reader, if ever there was one. You're on your own. Here Be Dragons.


Humor

True Wisdom

He may live without books,
what is knowledge but grieving?
he may live without hope,
what is hope but deceiving?
he may live without love,
what is passion but pining?
but where is the man that can
live without dining?

Robert Bulwer, Earl of Lytton


Reference

When Anything Can Happen, Nothing Matters

Film theory: an explanation for why so many overproduced movies are emotionally unsatisfying

If you've ever watched a movie where the climax was approaching and the story started getting too big for its britches, where the filmmakers kept adding ridiculous twists and turns and upping the ante, and where, instead of the movie getting more interesting, it became harder to suspend disbelief, the story started to feel disconnecting, and you began wondering how much longer the movie would take and how much more climactic this already hyped-up scene could get... congratulations, you have experienced what I term,

"When Anything Can Happen, Nothing Matters."

When done right, the action in a story feels personal to individual characters. When we know the stakes (and the stakes are not just "lots of empty emotionless buildings will get smashed, the whole city is under threat!"), we can feel empathy for specific characters, and we are drawn into the narrative. But when the filmmakers keep "pulling back" and doing a lot of what Brad Bird called "God shots," where the camera looks down on the action from above because the spectacle is so vast we have to take the mile-high view, literally, the focus shifts to spectacle over character, and we are at risk of losing track of why we should care.

A list of movies that suffer somewhat from this problem:

  • Incredibles 2 (literally trying to save the city)
  • Despicable Me 3 (really fun movie with a trumped up, silly, even boring ending)
  • Minions (story gets too big, literally)
  • Penguins of Madagascar (save the city, save everyone)
  • Home (2015 DreamWorks film - save the planet)
  • The Hobbit (fun story, fun story, war campaign, fun story, fun story)
  • The Avengers (2012 film - save the city, punch a building, nuclear threats, endless outsized horrors, and expanding good-guy powers to counter the horrors)

There are many other examples.

An example of a franchise that makes clear rules, in order to avoid this problem, is the Harry Potter books and films. The rules are there to make sure that the writer and reader know that the rules cannot just change at the drop of a hat. Something is at stake. We will not feel emotionally manipulated: "Oh no! Oh no! It's about to get terrible! ... just kidding, the good guys had this, the whole time!" No one wants that.

Note that when this rule is knowingly broken, and when new rules or weird backstories are introduced, but the writers dig in, and explore the ramifications, it can create tension and fun stories, for example the series Adventure Time, and Rick and Morty, which are truly bananas at times, but the stories explore that craziness instead of pouring it on thick and then magically erasing the craziness a few minutes later.

Note also that superhero stories can feel connecting when individual characters are vulnerable, and we can relate to them and imagine ourselves in their shoes. Spider-Man and Batman are two examples. Spider-Man allows us to wonder what we would do with new powers. And he is always getting banged up and hurt. He struggles even to understand his powers at times. That aspect of a superhero story is relatable and very human. Batman is even better, because he is the ultimate self-made superhero. He created his own superpowers with his intelligence, his gadgets, his study of combat. And he is still very vulnerable. We worry about his fate because he is so human (he is literally not superhuman like Thor or Superman). The opposite is very disconnecting: the struggles of titanic forces against each other, evil gods and good gods fighting at an inhumanly large scale.
