This project shows you how to train an image recognition model for the tiny Arduino Nicla Vision and use real-time video capture to create an AI ROCK SBC classifier. The Nicla Vision measures just 23 x 23 mm. Its main feature is a 2MP colour camera paired with a powerful STM32H747AII6 dual-core Arm® Cortex®-M7/M4 processor, capable of real-time video capture and image classification for Machine Learning (ML) applications running right on the board.
It also has built-in WiFi and Bluetooth Low Energy for network connectivity, plus an array of additional sensors and connectors, making it a really flexible low-power industrial sensor platform in a small footprint:
Multiple programming options for the board include:
In this project, we demonstrate how to build and test a complete Machine Vision application that has been trained to identify ROCK SBCs when they appear in the video stream.
We take you through the following steps, which can be adapted to your own edge imaging scenario:
You can see a video demonstration in this screen capture of OpenMV – keep an eye on the bottom left of the Serial Terminal to see the print line when a ROCK board is detected!
Before doing anything else, Arduino recommends updating the bootloader, as this may well have been updated since your board shipped. This involves installing the Arduino IDE or using their Cloud service.
We used Arduino IDE version 2.0.2 (the latest at the time of writing) on an x86-64 PC running Debian Bullseye for this project, but it should also run on a Mac or Windows PC and the installation steps will be similar. See Arduino’s software page.
chmod +x arduino-ide_2.0.2_Linux_64bit.AppImage
cat /etc/udev/rules.d/50-nicla.rules
ATTRS{idProduct}=="035f", ATTRS{idVendor}=="2341", MODE="664", GROUP="dialout"
id
uid=1000(pete) gid=1000(pete) groups=1000(pete),7(lp),20(dialout),24(cdrom),25(floppy),27(sudo),29(audio),30(dip),44(video),46(plugdev),108(netdev),115(lpadmin),118(scanner),998(docker)
sudo usermod -aG dialout $USER
./arduino-ide_2.0.2_Linux_64bit.AppImage
That’s how to update the bootloader.
Tip: If you run into problems, check the udev rules, then unplug and re-plug your Nicla Vision.
In this step, we install the OpenMV IDE programming environment, which supports MicroPython on the Nicla Vision. This lets us write Python apps that control the camera for tasks like object detection and image classification, trigger outputs and use the other on-board sensors.
./openmv-ide-linux-x86_64-2.9.7.run
You are now ready to connect the Nicla Vision to OpenMV and run a test to make sure everything is working as expected.
import sensor, image, time

sensor.reset()                        # Reset and initialise the camera sensor
sensor.set_pixformat(sensor.RGB565)   # Colour output
sensor.set_framesize(sensor.QVGA)     # 320x240 frames
sensor.set_vflip(True)                # Flip and mirror to match the board orientation
sensor.set_hmirror(True)
sensor.skip_frames(time=2000)         # Let the camera settle
clock = time.clock()

while True:
    clock.tick()                      # Update the FPS clock
    img = sensor.snapshot()           # Grab a frame
    print(clock.fps())                # Print the frame rate to the Serial Terminal
Now the Nicla Vision is all set up and working with OpenMV.
The next steps are all about creating an image dataset to build our Machine Vision model with. This consists of capturing video frames from the camera stream and saving them into a data set.
We are going to train our model to identify ROCK 4SE and ROCK 5B boards in the video stream, so we need to capture lots of images of both of these targets, plus some of the background. We will end up with 3 data classes: ROCK-4, ROCK-5 and Background.
We will take about 50 images of each board and background from different angles and with different lighting. The more variety, the more accurate your model is likely to be.
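If you prefer to script the capture rather than use the tools in the OpenMV IDE, a simple loop can grab frames and save them to the board’s flash filesystem, ready to be copied off and uploaded later. This is only a minimal sketch: the class name, image count and delay are assumptions to adjust for each capture session.

# Minimal dataset-capture sketch. CLASS_NAME, NUM_IMAGES and DELAY_MS are
# assumptions - change them for each capture session.
import sensor, time, os

CLASS_NAME = "rock-5"   # hypothetical label for this session
NUM_IMAGES = 50         # roughly 50 images per class, as described above
DELAY_MS = 2000         # time to reposition the board between shots

sensor.reset()
sensor.set_pixformat(sensor.RGB565)
sensor.set_framesize(sensor.QVGA)
sensor.set_vflip(True)
sensor.set_hmirror(True)
sensor.skip_frames(time=2000)

if CLASS_NAME not in os.listdir():
    os.mkdir(CLASS_NAME)            # one folder per class on the board

for i in range(NUM_IMAGES):
    img = sensor.snapshot()
    img.save("%s/%03d.jpg" % (CLASS_NAME, i))   # save the frame as a JPEG
    print("Saved %s/%03d.jpg" % (CLASS_NAME, i))
    time.sleep_ms(DELAY_MS)

The OpenMV IDE also has built-in dataset capture tools, so use whichever approach you find easier.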
To help with the process of creating the Machine Learning model, we will be using EdgeImpulse Studio. This contains a suite of tools used to create the model and generate a C++ library that can be used for inference and image classification, enabling Nicla to identify when target objects appear in its video stream.
EdgeImpulse has free accounts for developers, so visit their site and set one up if you don’t already have one.
Now that the project is created in EdgeImpulse, we need to upload the image data that we captured using the Nicla Vision camera.
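You can do this through the Data Acquisition page in the Studio, or script it against the Edge Impulse ingestion API from the host PC. The sketch below is one hedged example of a batch upload: the endpoint and headers follow the ingestion API documentation (check the current docs), while the API key, folder and label are hypothetical placeholders. Note that it sends everything to the training set; the Studio’s upload page can also split data between training and test for you.

# Hedged sketch: batch-upload captured JPEGs to Edge Impulse from the host PC.
# API_KEY, FOLDER and LABEL are placeholders - substitute your own values.
import os
import requests

API_KEY = "ei_xxxxxxxxxxxxxxxx"    # project API key from the Edge Impulse dashboard
FOLDER = "rock-5"                  # local folder of captured images
LABEL = "ROCK-5"                   # class label to attach in the Studio

for filename in sorted(os.listdir(FOLDER)):
    if not filename.endswith(".jpg"):
        continue
    with open(os.path.join(FOLDER, filename), "rb") as f:
        res = requests.post(
            "https://ingestion.edgeimpulse.com/api/training/files",
            headers={"x-api-key": API_KEY, "x-label": LABEL},
            files={"data": (filename, f, "image/jpeg")},
        )
    print(filename, res.status_code)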
You should now have all your data loaded into EdgeImpulse ready for processing. The data will be split into modelling data and test datasets automatically.
This next section is about creating an Impulse. This sets up all the parameters for building the ML model. EdgeImpulse guides you through this one stage at a time and, in the end, gives you an indication of how accurate your model is likely to be. Depending on what you are trying to detect, you may need to iterate through these steps trying different options to get the optimum performance.
Our model was just under 70% accurate, according to EdgeImpulse, but it performed well and was very precise at identifying both the ROCK-4SE and ROCK-5B boards.
The next stage sets the model parameters and generates a feature set from the object images:
Then, select Generate Features and choose All Items – this will trigger a remote job in the EdgeImpulse cloud to generate the model features.
At this stage, you can use Feature Explorer to gauge if the features are sufficiently distinct to give good results. You are looking for distinct clusters of features with as little overlap as possible. Our model is not that well separated because the 2 ROCK boards have some similarities, but it still works well in practice.
Now we can start training our model using the captured images from our training data set.
The training can take several minutes depending on the amount of data and the complexity of the features.
When it’s finished, you get the statistics about how effective your model is, and you can see how the training results are clustered in the Data Explorer view. You can click on the incorrect results and see the image that produced that result.
The final step in the model creation process is to test the model with a labelled test data set to see how well it performs. Some of your data will have been held back for testing purposes, or you can add more images to run the test with.
Another remote job will start and run the test data against the model, giving you the results when it completes. Our test didn’t appear to be that accurate, but it still worked well.
Now that we have a Machine Vision model that works, we can use EdgeImpulse to generate an OpenMV library that can be flashed to the Nicla Vision and be called from a Python script.
This will build the library and then open a file browser so you can choose the download location – you will get a zip archive named something like this:
ei-nicla-rock-id-openmv-v2.zip
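Inside the archive you will find the trained model (trained.tflite), the class labels (labels.txt) and an example script. As a hedged aside, if the model is small enough it can also be loaded straight from the board’s flash filesystem with tf.load() instead of being built into the firmware – a minimal sketch of that approach, assuming the two files have been copied onto the Nicla Vision drive, looks like this:

# Minimal sketch: load the EdgeImpulse model directly from the filesystem.
# Assumes trained.tflite and labels.txt have been copied to the board and
# that the model fits in the available memory.
import sensor, time, tf

sensor.reset()
sensor.set_pixformat(sensor.RGB565)
sensor.set_framesize(sensor.QVGA)
sensor.set_windowing((240, 240))
sensor.skip_frames(time=2000)

net = tf.load("trained.tflite", load_to_fb=True)            # load the model
labels = [line.rstrip("\n") for line in open("labels.txt")] # one label per line

clock = time.clock()
while True:
    clock.tick()
    img = sensor.snapshot()
    for obj in tf.classify(net, img):
        print(list(zip(labels, obj.output())))              # label/confidence pairs
    print(clock.fps(), "fps")

The firmware route described next keeps the model in internal flash instead, which is kinder to the board’s limited RAM.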
The next section uses a cunning way to build the firmware for the Nicla Vision: a GitHub Workflow created by OpenMV to automate the build. This means that you don’t need the Arm toolchain installed on your host machine, as the build runs in the GitHub cloud.
You will need a GitHub account for this – they have a free tier for developers.
git clone git@github.com:<your-github-account>/openmv-nicla-test.git
The model library has 2 components included in the .zip file downloaded earlier: the TensorFlow Lite model (trained.tflite) and a text file containing the target labels (labels.txt). We need to replace the original files in the src/lib/libtf/models directory of the cloned repo with our own model files.
cd openmv-nicla-test/src/lib/libtf/models
rm *.tflite *.txt
cp ../ei-nicla-rock-id-openmv-v2/trained.tflite rock_detection.tflite
cp ../ei-nicla-rock-id-openmv-v2/labels.txt rock_detection.txt
Now commit the changes and push them back to GitHub.
git add .
git commit
git push
When the updates are pushed to GitHub, a Workflow starts and builds the new firmware for the Nicla Vision – it takes 2-3 minutes.
The workflow compiles the firmware and publishes it as a GitHub release.
You now have the Nicla Vision firmware with the model built-in.
We can now return to OpenMV and flash the new firmware to the Nicla Vision.
The Nicla Vision will be flashed with the new firmware, which includes the EdgeImpulse model.
The next step is to write a Python script in OpenMV to control the Nicla camera and use the ML library to classify the image stream and try to detect our target objects.
The video stream is just a series of image frames, each of which is passed to a TensorFlow object that classifies the frame using the model and calculates a confidence prediction. If the confidence level is high enough, the label is checked to see which of the ROCK boards has been detected. The stream pauses while the onboard LED lights up momentarily to indicate whether it’s a ROCK 4 or a ROCK 5, and the result is also printed to the Serial Terminal of the IDE.
import sensor, image, time, os, tf, pyb

redLED = pyb.LED(1)    # built-in red LED
greenLED = pyb.LED(2)  # built-in green LED

sensor.reset()                          # Reset and initialize the sensor.
sensor.set_pixformat(sensor.RGB565)     # Set pixel format to RGB565 (or GRAYSCALE)
sensor.set_framesize(sensor.QVGA)       # Set frame size to QVGA (320x240)
sensor.set_vflip(True)
sensor.set_hmirror(True)
sensor.set_windowing((240, 240))        # Set 240x240 window.
sensor.skip_frames(time=2000)           # Let the camera adjust.

labels, net = tf.load_builtin_model('rock_detection')

found = False

def flashLED(led):                      # Indicate with LED when target is detected
    global found
    found = True
    led.on()
    pyb.delay(3000)                     # Pause the stream while the LED is lit
    led.off()
    found = False

clock = time.clock()

while not found:
    clock.tick()
    img = sensor.snapshot()

    for obj in tf.classify(net, img, min_scale=1.0, scale_mul=0.8, x_overlap=0.5, y_overlap=0.5):
        print("**********\nPredictions at [x=%d,y=%d,w=%d,h=%d]" % obj.rect())
        img.draw_rectangle(obj.rect())
        predictions_list = list(zip(labels, obj.output()))

        for i in range(len(predictions_list)):
            confidence = predictions_list[i][1]
            label = predictions_list[i][0]
            print("%s = %f" % (label, confidence))

            if confidence > 0.8:
                if label == "rock":
                    print("It's a ROCK-4SE")
                    flashLED(greenLED)
                if label == "rock-5":
                    print("It's a ROCK-5B")
                    flashLED(redLED)

    print(clock.fps(), "fps")
Now all the components are in place to run the Machine Vision demo. Once the camera is initialised, you can use the frame buffer view in OpenMV to aim the camera at the target, and if all goes well, the LED will turn on when one of the ROCK boards is detected.
Here’s the output when a ROCK-4SE board is detected – it only takes a few milliseconds!
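If you want to put your own number on the inference time, you can wrap a single classify call in a millisecond timer. This is a minimal sketch, assuming the custom firmware with the built-in model is already flashed:

# Hedged sketch: measure one classification pass in milliseconds.
import sensor, time, tf

sensor.reset()
sensor.set_pixformat(sensor.RGB565)
sensor.set_framesize(sensor.QVGA)
sensor.set_windowing((240, 240))
sensor.skip_frames(time=2000)

labels, net = tf.load_builtin_model('rock_detection')

img = sensor.snapshot()
start = time.ticks_ms()                             # timestamp before inference
tf.classify(net, img)
elapsed = time.ticks_diff(time.ticks_ms(), start)   # wrap-safe difference
print("Inference took %d ms" % elapsed)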
Nicla Vision can be quite sensitive to the type of USB cable and even certain ports on the host PC. If you get issues where the device does not enumerate on the USB bus, try a different port or cable.
You can see the device connecting by opening a Terminal and running the following command:
sudo dmesg -w
If OpenMV still does not recognise your Nicla Vision, check if it is enumerated and, if necessary, re-flash the Arduino bootloader.
Never try to ERASE the flash memory – we did this, and it corrupted the bootloader. The only way to rectify it was to use an STM programmer to flash a new bootloader. See this forum for details of how to do it with an STM32F446 Nucleo board (RS Part Number 906-4624).
The Nicla Vision from the Arduino Pro range is an amazing device for industrial Machine Vision applications on the Edge. It’s tiny and very powerful for such a small package with low power requirements.
This project has shown how to set up and build an application that can recognise the difference between 2 ROCK boards, and it is very accurate under controlled conditions, with only a very basic model developed in EdgeImpulse.
We demonstrated how the image classification inference can trigger the onboard LED, but this could just as easily activate the GPIO pins linked to an industrial controller on a production line.
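For example, swapping the flashLED() helper for a GPIO pulse is only a few lines. This is a hedged sketch: the pin name "D0" is an assumption, so check the OpenMV pinout for the Nicla Vision and adjust it to match your wiring.

# Hedged sketch: pulse an external GPIO instead of the onboard LED.
# The pin name "D0" is an assumption - check the Nicla Vision pinout.
import pyb

trigger = pyb.Pin("D0", pyb.Pin.OUT_PP)   # push-pull output to the controller

def pulse_output(duration_ms=500):
    trigger.high()            # assert the line when a ROCK board is detected
    pyb.delay(duration_ms)    # hold it long enough to register
    trigger.low()

In the main loop you would then call pulse_output() in place of flashLED(greenLED) or flashLED(redLED).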
The possibilities are endless! And don’t forget, there is also a ToF (Time of Flight) distance sensor, a microphone, a 6-axis IMU (accelerometer and gyroscope), WiFi and Bluetooth, an I2C connector and a battery connector with battery monitoring.