BIM — information mining, processing, and visualization

Slava Krel
14 min read · Apr 5, 2020

--

I was always trying to organize data on each project I worked on. My original post was about electrical prefabrication using BIM technology where I talked about information processing within 3D engineering and how it was helping me and my team achieve goals. Today I want to dive a little deeper into this topic and be more specific about which software tools and methods can be implemented.

What you will learn

  • Gain insight into your team’s activities and project progression
  • Report on targeted model conditions
  • Extract detailed model data quickly and efficiently
  • Develop a plan to bring coding talent onto your team
  • Understand which tools are available to harness the data generated by your teams and projects

Introduction

Even though the construction process and the design process are getting closer, they’re still pretty far away from each other. Today there are many projects running with BIM support, but the value BIM technology can provide is much higher than what projects actually utilize. Some on-site Project Managers use BIM more extensively than others, but still not enough to get those really big returns on investments.

BIM can drive a forward shift in how we plan, solve problems, and construct projects, which in turn increases safety, production, customer satisfaction, and opportunities for people. It sounds very general and fluffy, but it’s true. With BIM tools at the forefront of our projects we can ensure we are involved with like-minded, progressive construction managers and owners. We can significantly shorten the design/bid/build process. Through early involvement and collaboration we can provide value to the end user by teaming up with other subcontractors who are willing to put “skin in the game” and come out at the end with a satisfied customer who will bring repeat business to our team.

It would be nice if we can all get together (on-site team, prefab team, managers, and VDC/BIM team) and talk about everything we can do for each other to make life easier. Apparently, “everything” in the construction industry is a lot. There are so many processes, crews, documents, items we need to discuss. We can spend weeks developing an ideal workflow with all departments involved and never get to the bottom of it.

On the VDC/BIM side, as we work through projects, there’s a constant generation of information. Unfortunately, on-site teams almost never get to see it. On the other hand, there are a lot of different and important processes going on the construction site that we, as a BIM team, will never see.

The project management personnel who are managing the construction often do not have access to the schedule in its native form, nor do they know how to use the software that produces it. No matter what you do, manual visual observations and traditional progress monitoring based on field personnel’s interpretation are always going to be time-consuming, error-prone, and infrequent. The manual process does not capture enough for analysis; as a result, we miss many opportunities to inform future projects or to help current ones.

This issue becomes even more relevant with manufacturing. Today BIM teams bring in the design and try to make it modular. Bringing prefab assemblies into the construction process earlier allows team members to identify building components sooner. The ability to generate detailed models of prefabricated elements should boost productivity, but it doesn’t. Specialists don’t know how to transfer project data from phase to phase, or even what to transfer. In the construction business you can never be certain. RFIs and COs, cut sheets on equipment — all these things need to be processed and implemented into the project.

Picture 1. Project life cycle (picture is taken from unknown source).

As you know, there is a lot of information we need to take care of. And I’m sure every Project Manager has a backlog to track all project changes. But at some point there’s going to be so much information that you won’t be able to track it all. For this reason, we need to think about how to collect data automatically. Collecting data is what will help us understand the current workflow and, eventually, improve communication between the BIM, field, and management teams. No matter how good the coordinated model is, the building is not going to build itself. For this reason we need to take the field team’s workflow into consideration.

By analyzing data we can determine where a project needs additional support. Clearly, long-term data capture plays a big role going forward. This is why I’m working on short- and long-term data capture for project planning, estimating, and forecasting.

In this post I will explain several techniques for getting information out of your models and into a format that you can more easily digest and visualize, all with the goal of helping you make better decisions. I will also show tools I’m using for visualization.

From a practical standpoint, you can:

  • Build a small database with the most common construction items.
  • Reduce the amount of non-essential paper flow.

Stack

MsExcel, MsAccess, SQL Server, BlueBeam, Revit, AutoCAD, Fusion 360, Navisworks, Dynamo, Power BI, Templates: *.rvt, *.rfa, *.dwg, *.csv

What was captured and methods

You might be wondering what kind of data I’m trying to capture. I am trying to catch data from the most common programs in the construction business. The idea is to get as much data as we can to paint a full picture of the daily workflow. Engineers, drafters, prefabrication, on-site team — every department can find something important for them to use. We don’t want to filter anything out, because the more pieces of software we add, the less custom the process will be.

On the BIM team side there is data from your AEC tools: Revit, Dynamo, Navisworks, AutoCAD. The list of parameters is quite long. We can start with the very basic stuff like project_name, project_phase, project_address, and so on. What follows is a systematic, step-by-step procedure for mining design logs to monitor and measure the productivity of the design process.

The first and most basic technique for getting information out of your models is exporting Revit schedules and quantities to a CSV file that can be opened in MS Excel. It does require a fair amount of manual work, and the data will be relatively limited. There are a number of decent tools you can use to make Revit-to-CSV data extraction easier. With Ideate BIMLink, the user can pull information from a file into Microsoft Excel and push volumes of precise, consequential BIM data back into the Revit model with speed, ease, and accuracy. Data management tasks and workflows take a small fraction of the time they once took, and the cumulative advantage means more hours freed up. You gain unprecedented access to the Revit modeling data you need, for an enhanced workflow.
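Once a schedule is out of Revit, even the standard library is enough to start asking questions of it. Below is a minimal sketch that aggregates quantities per family from an exported schedule; the column names ("Family", "Count") and the sample rows are assumptions for illustration, not a fixed Revit export format.

```python
import csv
from collections import defaultdict
from io import StringIO

# Hypothetical extract of a Revit schedule exported to CSV; the column
# names are assumptions -- match them to your own schedule's headers.
SAMPLE = """Family,Type,Count
Conduit Run,EMT 3/4in,42
Junction Box,4in Square,17
Conduit Run,EMT 1in,8
"""

def totals_by_family(csv_text):
    """Sum the Count column per family from a schedule export."""
    totals = defaultdict(int)
    for row in csv.DictReader(StringIO(csv_text)):
        totals[row["Family"]] += int(row["Count"])
    return dict(totals)

print(totals_by_family(SAMPLE))
# {'Conduit Run': 50, 'Junction Box': 17}
```

In practice you would point `csv.DictReader` at the exported file instead of a string; the aggregation logic stays the same.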

I highly recommend having the database connection directly to the Revit project. You can use Revit DB Link to export Revit project data to the database, make changes to the data, and import it back into the project. The database displays Revit project information in a table view that you can edit before importing. This table view also allows you to create Revit Shared Parameters which adds new fields for those parameters in the related tables. Any change made to these new fields within the database updates Revit Shared Parameters upon future imports.

Dynamo offers the most flexibility, both in what types of data can be extracted from the Revit model and in what can be done with that data. I’m using Dynamo scripts not only to extract the information I need but also to track how many times a certain script has been used.

The Dynamo extension gives users powerful data-mining capabilities through a graphical user interface. These capabilities, once available only to Revit API experts, have made it easier to get to your Revit data, manipulate it, and stream it to many external destinations (Excel, Access, Microsoft SQL Server, MySQL, or SQLite) using the standard nodes or packages. The main advantage of Dynamo is that it is easy to use and, apart from the multiplicity of platforms, it also allows us to filter the information and export to many different data serialization formats like XML, JSON, HTML, and CSV. Together with Python scripting and C#, it can access the Revit API, making it the most versatile option here.
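The "stream model data to a database" step can be sketched in a few lines. In a real Dynamo Python node the rows would come from the Revit API (e.g. via RevitServices); here they are hardcoded, and an in-memory SQLite database stands in for SQL Server or MySQL, so treat this as a shape of the workflow rather than production code.

```python
import sqlite3

# Illustrative rows; in Dynamo these would be built from Revit elements.
rows = [
    ("Project A", "Conduit Run", "EMT 3/4in", 120.5),
    ("Project A", "Cable Tray", "Ladder 12in", 36.0),
]

conn = sqlite3.connect(":memory:")  # swap for SQL Server/MySQL in production
conn.execute(
    "CREATE TABLE IF NOT EXISTS model_data "
    "(project TEXT, category TEXT, type TEXT, length REAL)"
)
conn.executemany("INSERT INTO model_data VALUES (?, ?, ?, ?)", rows)
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM model_data").fetchone()[0]
print(count)  # 2
```

Parameterized inserts (`?` placeholders) keep the node safe against odd characters in type names, which Revit families are full of.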

At the same time, if you create a lot of Dynamo scripts within your company, you might want to know who is using them and how often. To capture data like Script_usage_person and Script_usage_times I wanted to add a few nodes. I first thought about having each script send an email to a dedicated Gmail account with any associated data, but after some research it seemed like a bad idea: something changed with Google’s authentication, so this feature isn’t working. Another idea is to set up a database and send it a packet of data every time the user opens a given definition. It requires a little coding, but it’s a lot less disruptive than receiving a message every time somebody runs a script.

Along with what is in the model, we are also going to parse Revit’s log files. We can catch additional data from that source: the logs contain a lot of information that will allow you to fully understand the state of your digital assets and the behavior of your users.
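Counting command frequencies from those logs is mostly a regex exercise. The fragment below is written in the style of a Revit journal file, but real journals are much noisier, so the pattern is a starting point to adapt, not a specification.

```python
import re
from collections import Counter

# A fragment in the style of a Revit journal; real files are noisier.
JOURNAL = '''
 Jrn.Command "Ribbon" , "Create a wall , ID_OBJECTS_WALL"
 Jrn.Command "Internal" , "Trim/Extend two lines or walls , ID_TRIM"
 Jrn.Command "Ribbon" , "Create a wall , ID_OBJECTS_WALL"
'''

# Capture the human-readable command label from each Jrn.Command line.
pattern = re.compile(r'Jrn\.Command\s+"[^"]+"\s*,\s*"([^,"]+)')

commands = Counter(m.group(1).strip() for m in pattern.finditer(JOURNAL))
print(commands.most_common(1))
```

Run over a few weeks of journals, a counter like this is exactly what surfaces the "three commands account for 60% of activity" kind of finding discussed later in this post.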

Since I’m using Navisworks for QC purposes and coordination, it would be necessary to collect viewpoints (quantity and type) and the information about model update (dates). Here is what the typical clash detection process looks like.

To automate the data collection process I’m reading the XML export files with a Python script: essentially, I’m exporting viewpoint data and saving it in my database. There are a number of different methods that can be used; I suggest ElementTree. There are other compatible implementations of the same API, such as lxml, and cElementTree in the Python standard library itself, but in this context what they chiefly add is even more speed — the ease of the programming part depends on the API, which ElementTree defines.
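Here is a minimal sketch of that parsing step. The tag and attribute names below are assumptions about the shape of a viewpoint export; inspect your actual XML and adjust the element names accordingly.

```python
import xml.etree.ElementTree as ET

# A minimal stand-in for a Navisworks viewpoint export; the tag and
# attribute names here are assumptions -- check your actual XML.
XML = """<exchange>
  <viewpoints>
    <view name="Clash 001 - FP vs Mech"/>
    <view name="Clash 002 - FP vs Mech"/>
    <view name="Section - Level 2"/>
  </viewpoints>
</exchange>"""

root = ET.fromstring(XML)
names = [v.get("name") for v in root.iter("view")]
# Split clash viewpoints from ordinary section/navigation views.
clash_views = [n for n in names if n.startswith("Clash")]
print(len(names), len(clash_views))  # 3 2
```

For real files, `ET.parse(path).getroot()` replaces `fromstring`, and the counts and names go straight into the database alongside the model-update dates.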

On the fabrication team side we are capturing how much time it takes to build one assembly and what the item costs. With the time tracking it’s pretty clear, though the item cost wasn’t as obvious for me in the beginning. Most of the Project Managers or Shop Supervisors are using this for estimating reasons. There are a number of desktop and web applications that allow you to quickly run the numbers and roughly find out the cost of all the items and supplies. Those applications have a lot of items, but don’t have the items you and your teams are using on a daily basis. For this reason, all my calculations needed to be adjusted. On the other hand, the Floor Supervisor in the shop is tracking materials in house — this person knows what he is buying or what he is going to buy in advance, so it’s easier to build your own database and estimate new projects off of it.

These days Bluebeam can be considered an industry standard. You may know Bluebeam Revu as the software that lets you mark up PDFs and collaborate digitally, but did you know that it can also save you time on administrative tasks? Revu tracks, classifies, and organizes PDF markups in the Markups List, and exporting a Markup Summary creates an easy-to-read report that can be saved and shared as a PDF or CSV document. For a while I was exporting XML data from a Revu summary into an Excel spreadsheet that was mapped for the XML. We had a specific form that the guys in the field used to figure their homeruns, and we didn’t want to change it at the time. Once the XML was imported into Excel, the values on the form were set to cell references like =$A$1 and the form auto-filled that way instead of handwriting everything. It works well, but takes a little extra time up front to set up. Once set up, though, it only takes a minute for all future summaries.

During the course of a project there will be items that fall outside of the Design Team’s scope. These are items that create a positive impact for the on-site team. Many of them will be identified in the planning stage and put into the Project Team’s Scope of Work. Some will come up “on the fly,” and the Field Staff will upload these items through a database.

Wait, a database? Yes. If you are able to implement Google Forms into your workflow, then you can save all the data into a Google Sheet, which can be used as a backend database. With this database I was able to save data and better understand the on-site workflow. Of course you want to eventually transfer all the data into a real database; in my case it was SQL Server. In a few months I was able to create something like an ordering system, where an on-site specialist can order the necessary assemblies from the prefab shop.
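The migration from the Google Sheet to a real database can be sketched like this. A Google Sheet of form responses can be downloaded as CSV; the column names below describe one possible order form and are assumptions, and an in-memory SQLite database stands in for SQL Server.

```python
import csv
import sqlite3
from io import StringIO

# Simulated Google Forms responses (a Google Sheet downloads as CSV);
# the columns are assumptions about one possible prefab order form.
RESPONSES = """Timestamp,Assembly,Quantity,Requested by
2020-03-02 08:15,4in sq box w/ mud ring,12,Crew A
2020-03-02 09:40,Homerun whip 20ft,6,Crew B
"""

conn = sqlite3.connect(":memory:")  # stand-in for SQL Server
conn.execute(
    "CREATE TABLE orders (ts TEXT, assembly TEXT, qty INTEGER, crew TEXT)"
)
for row in csv.DictReader(StringIO(RESPONSES)):
    conn.execute(
        "INSERT INTO orders VALUES (?, ?, ?, ?)",
        (row["Timestamp"], row["Assembly"], int(row["Quantity"]),
         row["Requested by"]),
    )
conn.commit()

total_qty = conn.execute("SELECT SUM(qty) FROM orders").fetchone()[0]
print(total_qty)  # 18
```

Once the responses live in SQL, the same table can feed both the prefab shop’s queue and the usage dashboards discussed below.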

All the typical items were processed through the database and uploaded to the mobile app. Assemblies designed in 2D and 3D (AutoCAD, Revit, Fusion 360) and built in the shop could then be shared with the field team via the app.

What you can get out of it

What can you do with all of that data? I don’t think there is a single recipe that will fit all companies out there. That’s the idea. We don’t want to tell you what to do with it. We are interested in allowing you to come up with your own questions. We are here to give you the tools that will back your stories with data. We are here to help you understand it, and visualize it. At the end of the day, this is your data, and your story. Based on experience here are the most common exercises I went through in the last couple of years.

Revit log results indicate the following:

  1. Each designer executes certain commands far more than any others. For example, for one designer the cumulative frequency of just three commands reached up to 60% of all commands they executed.
  2. A particular sequential pattern of design commands (“pick lines” → “trim/extend two lines or walls to make a corner” → “finish sketch”) has been executed 1000 times, which is 50% of instances associated with the top five discovered sequential patterns of design commands.
  3. The identified sequential patterns can be used as a project control means to detect outlier performers who may require additional attention from project leaders.
  4. Productivity performance within the discovered sequential patterns varies significantly among different designers.

This math matches the conclusions from BIM Log Mining: Measuring Design Productivity article (written by Limao Zhang and Baabak Ashuri).

Navisworks log results indicate the following:

  1. For a typical commercial project, bringing in the Fire Protection trade model produces more clashes with Mechanical than with any other model.
  2. To minimize the number of clashes, set up standard rules before you start modeling Plumbing.
  3. Constructability issues have to be resolved as soon as possible; if you keep them in a backlog, they will definitely cause more clashes.

As I said earlier, something else (another piece of software, a Bluebeam template, a Google Form, a survey, and so on) has to be developed for the on-site and prefab teams so they can collect data. With that being said, let’s try to answer the following questions.

Scenario #1: Let’s say that you shared with your field team, through your app, a list of assemblies that you think are critical to take full advantage of. How do you know that the assemblies you build are the right assemblies — the ones people actually use — as opposed to assemblies that are slowing them down? What if you could track how many items from these assemblies are being used across the whole company?

Picture 6. Assembly usage chart

Looking at the above chart, I could easily consider getting rid of the 4” sq box assemblies. Why deploy these to the whole company if no one seems to be using them much? With this knowledge you can save the valuable time it takes to create or adjust these assemblies over and over in your database.

Scenario #2: Let’s say that you have a trove of “approved” assemblies that you made available via the app. Wouldn’t you want to know how much time these are taking to build?

Having the ability to quickly see how much time it takes to build those assemblies allows you to dedicate time to them. For most companies, being able to focus some of their most valuable and expensive resources on things that are actually being used, means huge savings. Knowledge is time, and time is money.

Scenario #3: Now that you know what assemblies are being used, wouldn’t it be nice to know who’s using them?

You are investing time and money to create, maintain, and deploy these scripts. You spend time and money training your staff to be able to use them. It only makes sense to find out who is fully taking advantage of all that time and money spent on your digital assets. With this kind of data, you can easily pick their brain for potential improvements, but also figure out who might need some extra training and encouragement.

In addition, a few words about exporting from the database. In the world of industrial automation, CSV (Comma-Separated Values) files are very common and an excellent way to transfer data between systems. I can’t say enough about how much time we were able to save by extracting data into CSV files for all kinds of automatic and semi-automatic machines in the shop (benders, cable and strut cutters).
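That database-to-machine handoff is a short query plus a CSV write. The table layout, column names, and sample cut-list items below are illustrative assumptions, with SQLite again standing in for the production database.

```python
import csv
import sqlite3
from io import StringIO

# Hypothetical cut list living in the shop database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cut_list (item TEXT, length_in REAL, qty INTEGER)")
conn.executemany(
    "INSERT INTO cut_list VALUES (?, ?, ?)",
    [("Strut P1000", 24.0, 10), ("MC Cable 12/2", 240.0, 4)],
)

# Write the rows in the flat CSV shape a bender or cutter can ingest.
out = StringIO()  # swap for open("bender_feed.csv", "w", newline="")
writer = csv.writer(out)
writer.writerow(["item", "length_in", "qty"])
writer.writerows(conn.execute("SELECT * FROM cut_list"))

print(out.getvalue().splitlines()[1])  # Strut P1000,24.0,10
```

The only machine-specific part is usually the header row and the units, so one function per machine type tends to be enough.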

Visualization

Once BIM data was accessible in DB SQL Server, the last step of automation was to prepare the environment for publishing dashboards developed in Power BI. See the diagram below.

As you can see, some of the work has to be done manually (creating tables in SQL, for example). But obviously you are not going to do that on a daily basis; most likely monthly. Beyond that, I was able to make the data visualization process almost fully automatic.

Wrapping up

The idea that you need advanced help to collect, automate, and visualize AEC data is very hard to sell to upper management. Obviously, it requires a bit more work and knowledge, depending on the complexity of the desired solution. On the bright side, with this option you are not limited to exporting data into just Excel files. You can store the data directly in some kind of SQL database and, eventually, integrate it directly into other software.

You can create applications for people in the field, anything that could get them closer to the prefabrication and design process, to make the communication process easier.

Templates are one of the most important parts of a smooth workflow. No matter how good you are at modeling or managing, a well-organized project will save you a lot of time down the road.
