Azulejo Diffusion Reconstruction
Diffusion-based digital reconstruction of Macao-style Azulejo tiles
This page focuses on visual results: img2img reconstruction, text2img generation, and visualizations of structural constraints and style consistency.
Img2img Reconstruction (k = 0.35)
With sampling noise fixed at k = 0.35, we compare four model configurations on two symmetry types. The main focus is whether central / diagonal symmetry is preserved and whether the pattern stays continuous across tile seams.
A. Central Symmetry
Reading hint: under a fixed noise condition (k = 0.35), compare the four configurations from left to right in terms of (i) geometric consistency (symmetry mapping), (ii) topological continuity (seams and borders), and (iii) style consistency (blue–white gamut and brushwork). Pay particular attention to the central cross seam and the agreement between the four quadrants.
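The quadrant comparison suggested above can also be approximated numerically. The following is a minimal sketch, not the evaluation code used for this page: it assumes that "central symmetry" means mirror symmetry about both central axes, and the function name `quadrant_agreement` is hypothetical.

```python
import numpy as np


def quadrant_agreement(tile):
    """Mean absolute difference between the top-left quadrant and the
    other three quadrants after mirroring them back onto it.

    Assumption (not stated on this page): central symmetry is treated as
    mirror symmetry about the vertical and horizontal central axes.
    A value of 0.0 means the quadrant matches the top-left one exactly.
    """
    img = np.asarray(tile, dtype=np.float64)
    h, w = img.shape[:2]
    h2, w2 = h // 2, w // 2  # a middle row/column of odd sizes is ignored

    top_left = img[:h2, :w2]
    # Flip each remaining quadrant so it should coincide with top_left.
    top_right = img[:h2, w - w2:][:, ::-1]
    bottom_left = img[h - h2:, :w2][::-1, :]
    bottom_right = img[h - h2:, w - w2:][::-1, ::-1]

    quadrants = (
        ("top_right", top_right),
        ("bottom_left", bottom_left),
        ("bottom_right", bottom_right),
    )
    return {name: float(np.mean(np.abs(top_left - q))) for name, q in quadrants}
```

Lower values indicate better agreement with the top-left quadrant. If the intended symmetry is a 180° point reflection instead, comparing the tile with `np.rot90(img, 2)` would be the analogous check.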
B. Diagonal Symmetry
Note: compared to central symmetry, diagonal symmetry is more sensitive to global geometric consistency and typically fails through “diagonal drift” and “seam discontinuity”. This setting is therefore useful for disentangling the effects of the different modules: LoRA mainly improves style, whereas UNet Shrink / ControlNet mainly target structure and border regularity.
Quantitative Metrics (single reference, multi-sample)
Metric definitions (sketch):
let $x$ be the generated tile and $x_{\mathrm{ref}}$ the aligned reference.
• SSIM (structural similarity) is computed on luminance:
$\mathrm{SSIM}(x, x_{\mathrm{ref}}) = \frac{(2\mu_x \mu_r + C_1)(2\sigma_{xr} + C_2)}{(\mu_x^2 + \mu_r^2 + C_1)(\sigma_x^2 + \sigma_r^2 + C_2)}$.
• PSNR (dB) is derived from the mean squared error $\mathrm{MSE} = \lVert x - x_{\mathrm{ref}} \rVert_2^2 / N$:
$\mathrm{PSNR} = 10 \log_{10}(\mathrm{MAX}^2 / \mathrm{MSE})$, with $\mathrm{MAX} = 255$.
• Diagonal symmetry index compares the tile to its diagonal-mirrored version inside a symmetric mask:
roughly $\mathrm{SymDiag} = 1 - \mathrm{MSE}(x, M(x)) / \sigma^2$, where $M$ denotes diagonal reflection.
• $\Delta E^*_{ab}$ is the average CIE76 color difference in Lab space:
$\Delta E^*_{ab} = \frac{1}{N} \sum \lVert L^*a^*b^*(x) - L^*a^*b^*(x_{\mathrm{ref}}) \rVert_2$.
• Mean seam error measures MSE along vertical / horizontal seams after tiling the image in a 2×2 grid.
• Texture style distance is the L2 distance between normalized GLCM (gray-level co-occurrence matrix) feature vectors of $x$ and $x_{\mathrm{ref}}$.
• Color style distance is the L2 distance between normalized Lab color histograms.
• StyleDist aggregates normalized texture, color and CLIP-based semantic distances via a weighted sum.
• CSCI (Cultural Style Consistency Index) uses the style feature space: for each method, we compute the mean distance to the reference cluster $D_{\mathrm{ref}}$ and the mean intra-cluster distance $D_{\mathrm{intra}}$, map them into $[0,1]$ scores $C_{\mathrm{ref}}$, $C_{\mathrm{intra}}$ by exponential normalization, and take the geometric mean
$\mathrm{CSCI} = \sqrt{C_{\mathrm{ref}} \cdot C_{\mathrm{intra}}}$.
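Several of the definitions above are simple enough to sketch in NumPy. The code below is illustrative only and is not the evaluation code behind the figures. All function names are hypothetical, and it makes several assumptions: the diagonal symmetry index is computed over the whole tile (no mask) with $\sigma^2$ taken as the pixel variance of $x$; the seam error compares only the one-pixel edge rows and columns that meet when the tile is repeated; and "exponential normalization" is taken to be $\exp(-D / s)$ with a free scale $s$.

```python
import numpy as np


def psnr(x, x_ref, max_val=255.0):
    """PSNR in dB from the mean squared error between x and x_ref."""
    x = np.asarray(x, dtype=np.float64)
    x_ref = np.asarray(x_ref, dtype=np.float64)
    mse = np.mean((x - x_ref) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_val ** 2 / mse))


def diagonal_symmetry(x):
    """SymDiag = 1 - MSE(x, M(x)) / var(x), with M the main-diagonal mirror.

    Assumptions: whole tile instead of a symmetric mask, and sigma^2 is
    the pixel variance of x. Requires a square tile.
    """
    x = np.asarray(x, dtype=np.float64)
    if x.shape[0] != x.shape[1]:
        raise ValueError("diagonal reflection needs a square tile")
    mirrored = np.swapaxes(x, 0, 1)  # reflect across the main diagonal
    var = x.var()
    if var == 0:
        return 1.0  # a constant tile is trivially symmetric
    return float(1.0 - np.mean((x - mirrored) ** 2) / var)


def mean_seam_error(x):
    """Average MSE across the seams of a 2x2 tiling of x.

    In such a tiling the right edge meets the left edge (vertical seam)
    and the bottom edge meets the top edge (horizontal seam).
    Assumption: only the one-pixel-wide edges are compared.
    """
    x = np.asarray(x, dtype=np.float64)
    vertical = np.mean((x[:, -1] - x[:, 0]) ** 2)
    horizontal = np.mean((x[-1, :] - x[0, :]) ** 2)
    return float((vertical + horizontal) / 2.0)


def csci(d_ref, d_intra, scale=1.0):
    """Geometric mean of two exponentially normalized distances.

    Assumption: C = exp(-D / scale); the exact normalization and scale
    are not specified on this page.
    """
    c_ref = np.exp(-d_ref / scale)
    c_intra = np.exp(-d_intra / scale)
    return float(np.sqrt(c_ref * c_intra))
```

With these conventions, PSNR and SymDiag are higher-is-better, the seam error is lower-is-better, and CSCI lies in (0, 1], reaching 1 only when both distances are zero.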
Text2img Generation (Symmetry Groups)
Here we show text2img results grouped by central symmetry and diagonal symmetry, comparing different model configurations under the same sampling setup. The current version includes SD1.5, SD1.5 + LoRA, and SD1.5 + LoRA + ControlNet. SD1.5 + LoRA + UNet Shrink will be added as a follow-up experiment (work in progress).
A. Central Symmetry
What to compare: focus on the agreement along the central axes and between the four quadrants, as well as boundary continuity (tiling readiness), while checking that the blue–white style remains faithful.
B. Diagonal Symmetry
Work in progress: we plan to add SD1.5 + LoRA + UNet Shrink for text2img as well, to study how UNet shrink affects structural stability and style drift in free generation.
Framework (Data → Model → Evaluation)
Method Core (Structure-aware Diffusion)
Text is deliberately compressed here: the main goal of this page is to show visual evidence. Full technical details are documented in the thesis PDF.
To-do & Future Work
This page is a snapshot of an ongoing project. Below is a non-exhaustive to-do list for experiments and extensions that are completed, in progress, or planned for the next stages.
Completed for this page
- Img2img quantitative metrics (Figures 3–4): CSCI bar chart and multi-metric plots for a single reference tile with multiple samples.
Ongoing
- Text2img: add SD1.5 + LoRA + UNet Shrink configuration and compare with existing baselines.
- User study: expand the subjective experiment, align human ratings with style-space distances, and refine the CSCI calibration.
Short-term experiments
- Img2img noise-level study: systematically compare reconstruction quality under multiple denoising strengths (e.g. k = 0.25, 0.35, 0.45, 0.55, 0.65), focusing on the trade-off between structure preservation, seam continuity and texture restoration.
- Text2img · cultural style consistency evaluation: design a CSCI-based evaluation protocol that measures how well text2img generations preserve the cultural style of the reference tiles, using central / diagonal symmetry galleries with diverse prompts, seeds and annotated failure cases.
- Dataset: augment the training and evaluation set beyond street signs and floral tiles to include azulejo-style tiles depicting figures (human portraits), landscapes and architectural scenes, and test how well the current pipeline generalizes to these motifs.
Longer-term ideas
- Structure-aware diffusion backbone: move from “SD1.5 + add-on ControlNet” to a diffusion model with explicit symmetry / tiling priors built into the UNet itself.
- CSCI & style-space refinement: extend the cultural style consistency index (CSCI) family and learn a human-aligned style distance from larger-scale subjective experiments.
- Cross-domain cultural patterns: apply the same pipeline to other pattern-based cultural media, such as azulejos in Portugal, mosaics and textile patterns, and compare style spaces across domains.