At NVIDIA GTC, it’s a tradition to release new products, solutions, and services. In that tradition, NVIDIA launched a new version of the Jetson Nano, a tiny ARM-based computer with a real GPU that is capable of doing magical things. It’s basically a Raspberry Pi on steroids. I was lucky to receive one of these new models, the Jetson Nano 2GB, to test it and share my experiences here.
Why the NVIDIA Jetson Nano?
I have used the Jetson Nano for a variety of things. The main goal was to get started on AI and work on my coding skills. There is a free one-day course on NVIDIA’s learning platform DLI (Deep Learning Institute) that gets you started and perfectly explains the process of training, validating, and using a model for object recognition purposes.
Since the Jetson Nano has 128 CUDA cores, a quad-core ARM CPU, 1 GbE, and low power consumption, it is also the perfect device for IoT projects that require a bit more computational power. In my case, I wrote an app that does image recognition on an RTSP stream coming from security cameras. A Raspberry Pi simply can’t run the app because it’s not powerful enough. A device like an Intel NUC would be suitable, but it is overkill in terms of resources. So, again, a perfect use case for the Jetson Nano.
Why a new model?
The old model was awesome, but there were some challenges that NVIDIA has now solved 🙂
Let’s first look at the specs of the new model:
|Component|Specification|
|---|---|
|GPU|128-core NVIDIA Maxwell|
|CPU|64-bit Quad-core ARM A57 (1.43 GHz)|
|Memory|2 GB 64-bit LPDDR4 (25.6 GB/s bandwidth)|
|Wireless Connectivity|Available via an accessory 802.11ac wireless adaptor|
|USB|1x USB 3.0 Type A port, 2x USB 2.0 Type A ports, 1x USB 2.0|
|40-Pin Header|GPIOs, I2C, I2S, SPI, PWM, UART|
|Camera|1x MIPI CSI-2 connector|
|Storage|MicroSD (Card not included)|
|Other IO|12-pin header (Power and related signals, UART); 4-pin fan header (which my model didn’t have)|
|Size|100mm x 80mm x 29mm|
Compared to the old model, I see a few improvements:
- The price has dropped to 59 dollars for a developer kit, which will probably translate to 59 euros in Europe.
- The new model has WiFi support (through the accessory wireless adapter), whereas the old model only had 1 GbE.
- If you want to fully utilize the GPU on the old model, you need a separate (traditional) power adapter so enough power will be available to run your AI app. The new model just has USB-C, which is sufficient to do whatever you want on the device!
I also found some weird things:
- The new model has 2 GB of RAM instead of 4 GB. I honestly don’t know why this is, but for bigger apps or models this might not be sufficient. I also played around with K8s on the old model, and with only 2 GB, running multiple containers next to each other could become a problem.
- The model I have doesn’t have a fan connector. The wiring is there, but there are no pins. During the benchmark I ran, I could fry an egg on the heatsink. So, I honestly don’t know why it doesn’t just come with the pins for the fan header.
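Since the 2 GB of RAM is shared between the CPU and the GPU, it can help to check how much memory is actually free before loading a model. Below is a minimal sketch, assuming a Linux system that exposes `/proc/meminfo`. The function names, the 256 MiB safety reserve, and the sample numbers are my own illustrative assumptions, not measurements from the device.

```python
from pathlib import Path

# Illustrative /proc/meminfo excerpt; the numbers are made up, not measured on a Jetson.
SAMPLE_MEMINFO = """\
MemTotal:        2027000 kB
MemFree:          310000 kB
MemAvailable:    1048576 kB
SwapTotal:       1013500 kB
"""


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: value in kB}."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "Name: value" shape
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def has_headroom(meminfo, needed_mib, reserve_mib=256):
    """Return True if MemAvailable covers the request plus a safety reserve."""
    available_mib = meminfo.get("MemAvailable", 0) / 1024
    return available_mib >= needed_mib + reserve_mib


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live file (Linux only); defined here but not called below."""
    return parse_meminfo(Path(path).read_text())


info = parse_meminfo(SAMPLE_MEMINFO)
print(f"MemAvailable: {info['MemAvailable'] / 1024:.0f} MiB")
print("Room for a 500 MiB model:", has_headroom(info, needed_mib=500))
print("Room for a 900 MiB model:", has_headroom(info, needed_mib=900))
```

On the board itself, `read_meminfo()` would replace the sample string. The reserve is a rough guess to leave room for the desktop and other processes, so adjust it to your own setup.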
How does it perform?
To get started on the device, I loaded the Jetson Nano developer image, which will be available on the NVIDIA developer site soon. The initial boot and configuration took around 60 minutes to complete. After the boot, I went straight to the benchmark.
To get an idea of the performance of the device, I ran a ResNet50 benchmark test to get the FPS of the model.
    root@jetson-nano:/usr/src/tensorrt/bin# ./trtexec --output=prob --deploy=../data/googlenet/ResNet50_224x224.prototxt --fp16 --batch=1
    output: prob
    deploy: ../data/googlenet/ResNet50_224x224.prototxt
    fp16
    batch: 1
    Input "data": 3x224x224
    Output "prob": 1000x1x1
    name=data, bindingIndex=0, buffers.size()=2
    name=prob, bindingIndex=1, buffers.size()=2
    Average over 10 runs is 27.0601 ms (host walltime is 27.1189 ms, 99% percentile time is 27.1436).
    Average over 10 runs is 27.0688 ms (host walltime is 27.1243 ms, 99% percentile time is 27.1591).
    Average over 10 runs is 27.0208 ms (host walltime is 27.0704 ms, 99% percentile time is 27.1617).
    Average over 10 runs is 27.0356 ms (host walltime is 27.0869 ms, 99% percentile time is 27.1166).
    Average over 10 runs is 27.059 ms (host walltime is 27.1139 ms, 99% percentile time is 27.1466).
    Average over 10 runs is 27.0114 ms (host walltime is 27.0615 ms, 99% percentile time is 27.1533).
    Average over 10 runs is 27.0663 ms (host walltime is 27.1186 ms, 99% percentile time is 27.1596).
    Average over 10 runs is 27.0332 ms (host walltime is 27.0966 ms, 99% percentile time is 27.1498).
    Average over 10 runs is 27.0304 ms (host walltime is 27.0901 ms, 99% percentile time is 27.1011).
    Average over 10 runs is 27.0492 ms (host walltime is 27.107 ms, 99% percentile time is 27.1231).
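Note that trtexec reports latency in milliseconds, not FPS. The Jetson Nano 2GB averages about 27.0 ms per inference at batch size 1, which works out to roughly 37 FPS.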
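The conversion from the reported latency to frames per second is easy to script. Here is a minimal sketch in Python that pulls the "Average over 10 runs" values from the run above and converts the mean to FPS. The function names are my own, and the formula assumes back-to-back inference at batch size 1 with no extra overhead for things like camera decoding.

```python
import re

# Leading part of each summary line from the trtexec run above.
LOG = """\
Average over 10 runs is 27.0601 ms
Average over 10 runs is 27.0688 ms
Average over 10 runs is 27.0208 ms
Average over 10 runs is 27.0356 ms
Average over 10 runs is 27.059 ms
Average over 10 runs is 27.0114 ms
Average over 10 runs is 27.0663 ms
Average over 10 runs is 27.0332 ms
Average over 10 runs is 27.0304 ms
Average over 10 runs is 27.0492 ms
"""


def mean_latency_ms(log):
    """Average the per-block latencies (in ms) found in trtexec-style output."""
    values = [float(v) for v in re.findall(r"Average over \d+ runs is ([\d.]+) ms", log)]
    if not values:
        raise ValueError("no 'Average over N runs' lines found")
    return sum(values) / len(values)


def throughput_fps(latency_ms, batch_size=1):
    """Convert per-batch latency in milliseconds to images per second."""
    return batch_size * 1000.0 / latency_ms


latency = mean_latency_ms(LOG)
print(f"mean latency: {latency:.2f} ms")
print(f"throughput:   {throughput_fps(latency):.1f} FPS")
```

In a real pipeline, such as my RTSP app, the end-to-end frame rate will be lower than this number because decoding and pre-processing also take time.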
I ran the same benchmark on the old model, and the results were essentially the same. I guess that’s mainly due to the fact that the CPU and GPU are exactly the same.
The new model has the same GPIO header as the old model. This ensures you can utilize all of the sensors and devices you might already be using on the old model without any issues.
The 2GB model has a single CSI connector, so you can still connect a camera like the Raspberry Pi camera to the board through the native connector. Unlike the 4GB model, the 2GB lacks an M.2 slot.
Lastly, the 2GB model only has an HDMI port, instead of the HDMI and DisplayPort on the 4GB model. This isn’t really an issue, because I think most of us will use it headless anyway.
I think the Jetson Nano 2GB is a great board to start your own AI project, helped by all of the support and the ecosystem around it. The lower price enables many more people to kickstart their own AI and IoT journeys, but it might come with some limitations. If you don’t need all of the connectors that the 4GB model has, the new NVIDIA Jetson Nano 2GB is a great alternative. The new 2GB model should be GA at the end of October. If you can’t wait, the 4GB is also available 🙂
If you want to know more about the device, here are a couple of interesting links: