Utilizing pre-trained networks for tasks of architectural image generation and recognition

Advisors: Sean Hanna and Kahlid El-Ashry



Image 01 – A visualization of how the trained network understands the similarity of the images: architectural imagery on the right and architectural drawings on the left. Interestingly, sections tend to fall in the middle of this diagram.
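A diagram like this is commonly produced by projecting the network's high-dimensional feature vectors (for instance, penultimate-layer activations, an assumption here) down to two dimensions. A minimal sketch using PCA via numpy:

```python
import numpy as np

def project_features_2d(features):
    """Project high-dimensional feature vectors to 2D with PCA.

    features: (n_images, n_dims) array of network activations
    (e.g. the penultimate layer of a trained CNN).
    Returns an (n_images, 2) array of coordinates for plotting.
    """
    centered = features - features.mean(axis=0)
    # Principal axes via SVD of the centred data; singular
    # values come out sorted, so the first two rows of vt are
    # the directions of greatest variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T

# Toy example: 6 hypothetical image embeddings in 8 dimensions.
rng = np.random.default_rng(0)
emb = rng.normal(size=(6, 8))
coords = project_features_2d(emb)
print(coords.shape)  # (6, 2)
```

In practice a nonlinear embedding such as t-SNE is often preferred for this kind of similarity map; PCA is used here only because it keeps the sketch self-contained.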


Training deep learning networks to distinguish images is a well-documented practice for general tasks – teaching the machine to tell the difference between types of dogs or types of flowers. This is accomplished by feeding the network a surplus of well-structured, labeled image data. For tasks with sparse and limited data, we can use networks trained on large, general image datasets – pre-trained networks – to transfer their knowledge and accomplish a similar task with less training time and data.
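The transfer-learning idea can be sketched in a framework-agnostic way: freeze a "pre-trained" feature extractor and train only a small new classifier head on the sparse target data. Everything below is illustrative – the frozen extractor is a stand-in (a fixed random projection) for an actual CNN trained on a large dataset, and the binary labels stand in for a task like plan vs. section:

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in for a frozen pre-trained feature extractor:
# in practice this would be a CNN trained on a large dataset.
W_frozen = rng.normal(size=(64, 16)) / 8.0
def extract_features(x):
    return np.tanh(x @ W_frozen)  # frozen, never updated

# Toy binary task with little data (40 examples).
X = rng.normal(size=(40, 64))
y = (X[:, 0] > 0).astype(float)

# Only the new linear head is trained, by gradient descent
# on the logistic loss; the extractor's weights stay fixed.
w, b = np.zeros(16), 0.0
lr = 0.5
for _ in range(200):
    f = extract_features(X)
    p = 1.0 / (1.0 + np.exp(-(f @ w + b)))  # logistic head
    grad = p - y
    w -= lr * f.T @ grad / len(y)
    b -= lr * grad.mean()

acc = ((p > 0.5) == y).mean()
print(f"training accuracy: {acc:.2f}")
```

Because only the small head is optimised, far fewer labeled examples are needed than training the whole network from scratch; fine-tuning additionally unfreezes some extractor layers at a low learning rate.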

Unfortunately, data that is distant and grammatical in nature, such as languages or architectural drawings, presents a further challenge due to what is referred to by Jeffrey Elman as the projection problem: the example data might not provide enough information for the network to generalize and understand the underlying logic of the data. Furthermore, in the process of transferring knowledge it is relatively unknown where and how the network adapts to these new domains.

This paper therefore evaluates to what extent pre-trained networks can reduce the information needed to distinguish architectural plans from sections, and where in the network the adaptation occurs. We survey methods of domain adaptation that reduce the distance from source domains to target domains, accomplished by shifting, or fine-tuning, the CNN feature space learned from large datasets using auxiliary datasets. Additionally, we introduce a method that uses generative adversarial networks (GANs) to create data, either feeding the generated content into a typical CNN pipeline or transferring the discriminator model's weights to train a new model. The results suggest that getting the network to generalise spatial patterns in architectural drawings requires a diverse set of high-level features, which can be acquired either through similar low-level features or directly through domain-specific creation.
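The second route above – reusing the discriminator's weights – amounts to a weight copy: the discriminator's learned feature layers initialise a new classifier, and its real/fake output head is replaced by a fresh classification head. All layer names and shapes in this sketch are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical discriminator weights after GAN training:
# two feature layers, then a single real/fake output unit.
disc = {
    "layer1": rng.normal(size=(128, 64)),
    "layer2": rng.normal(size=(64, 32)),
    "real_fake_head": rng.normal(size=(32, 1)),
}

def transfer_discriminator(disc_weights, n_classes):
    """Initialise a classifier from discriminator weights.

    The feature layers are copied; the real/fake head is
    discarded and replaced by a fresh classification head.
    """
    return {
        "layer1": disc_weights["layer1"].copy(),
        "layer2": disc_weights["layer2"].copy(),
        # New head, e.g. plan vs. section (n_classes outputs),
        # initialised to zero and trained on the target data.
        "class_head": np.zeros((disc_weights["layer2"].shape[1],
                                n_classes)),
    }

clf = transfer_discriminator(disc, n_classes=2)
print(clf["class_head"].shape)  # (32, 2)
```

The intuition is that a discriminator trained to judge architectural drawings has already learned domain-specific features, so the new classifier starts much closer to the target domain than a generic pre-trained network would.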


Image 02 – ALV Kernel – An image of the first layer inside a network, depicting how the network filters and breaks down the image for a recognition task. We can see that at this stage the image is being understood through its contours. The color tone variance refers to excitatory responses in white and inhibitory responses in black. Input images: flower and dog.
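Renderings like this are typically produced by rescaling each first-layer kernel so the most inhibitory weight maps to black and the most excitatory to white. A minimal numpy sketch of that normalisation, using an assumed hand-written edge kernel as the example:

```python
import numpy as np

def kernel_to_image(kernel):
    """Rescale a convolution kernel to [0, 1] for display.

    Negative (inhibitory) weights map toward black (0),
    positive (excitatory) weights toward white (1).
    """
    lo, hi = kernel.min(), kernel.max()
    if hi == lo:                       # flat kernel: mid-grey
        return np.full_like(kernel, 0.5, dtype=float)
    return (kernel - lo) / (hi - lo)

# Example: a 3x3 vertical-edge kernel (a contour detector,
# of the kind early CNN layers tend to learn).
edge = np.array([[-1.0, 0.0, 1.0],
                 [-1.0, 0.0, 1.0],
                 [-1.0, 0.0, 1.0]])
img = kernel_to_image(edge)
print(img.min(), img.max())  # 0.0 1.0
```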





Image 03 – A comparison of pre-trained network layers, depicting how the layers capture finer and finer details and features the deeper one goes into the network. The layers are more similar to one another because pre-training develops general biases. Input images: sunflower, dog, sketch of a cat, and architectural CAD plan.


Image 04 – A comparison of network layers trained from scratch, depicting how the layers capture finer and finer details and features the deeper one goes into the network. Input images: sunflower, dog, sketch of a cat, and architectural CAD plan.