A Shocking Bug

Diamond Kinetics Dev Team
Aug 14, 2024

Swings, Bluetooth Low Energy, and Crystal Oscillators

Jesse Bahr, Embedded Software Engineer

Diamond Kinetics app and Bat Sensor

Before the release of the Diamond Kinetics (DK) mobile app in April 2023, the team uncovered a severe problem: the app kept disconnecting from our bat sensor, which uses Bluetooth Low Energy (BLE) to connect wirelessly to an iOS device. This happened roughly every 5–10 swings, stalling the flow of a batting session and resulting in a subpar user experience. Because we had previously shipped our legacy SwingTracker app without observing this behavior, we were initially surprised to see it in a new app built on the same code in both the mobile app and the embedded system. As it turned out, the previous app masked the issue by actively reconnecting, a mechanism not yet integrated into the new app. With the bug uncovered, it seemed prudent to find and fix the root cause of the disconnect before adding the automatic reconnect mechanism. This effort led from the mobile app down into the depths of the sensor’s embedded software, where we found an unfamiliar and unlikely root cause.

To start, we followed the debug steps you’d normally expect: setting breakpoints, stepping through code, and checking memory before, during, and after the disconnect. This proved difficult on the sensor side, since neither faking swings through software nor applying basic impacts to a bare board connected to a debugger reproduced the issue. To reproduce it, I ended up rigging the board inside our high-performance mount on the end of a bat, with a few cables sticking out. Ultimately, the debug results pointed to a timeout of the BLE connection on both sides, which did not reveal a specific problem.

I then checked the BLE traffic with Wireshark and the Nordic debug board. This showed that BLE messages coming from the sensor would drop off suddenly during the swing, leaving the mobile app without a response for a short period. This was confusing, since the sensor reported the disconnect reason as a timeout and no fault or reset occurred on the main processor. At this point, I began to worry that something more fundamental was going on beyond a simple bug. To check this, I disabled all code on the sensor except for the code that maintained the BLE connection, and a swing still resulted in a disconnect.
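To illustrate what "dropping off suddenly" looks like in a capture, here is a minimal Python sketch that scans packet timestamps for silences. It is not the tooling used during this investigation: the function name `find_silences`, the synthetic timestamps, and the three-interval threshold are all assumptions chosen for illustration. Only the 15 ms connection interval comes from the measurements described later in this post.

```python
from typing import List, Tuple

CONN_INTERVAL_US = 15_000  # 15 ms connection interval, in microseconds


def find_silences(timestamps_us: List[int], max_gap_us: int) -> List[Tuple[int, int, int]]:
    """Return (start, end, duration) for every gap between consecutive
    packets that is longer than max_gap_us."""
    silences = []
    for prev, cur in zip(timestamps_us, timestamps_us[1:]):
        gap = cur - prev
        if gap > max_gap_us:
            silences.append((prev, cur, gap))
    return silences


# Synthetic capture (hypothetical data): ten packets on schedule, then the
# sensor goes quiet for 500 ms before traffic resumes.
timestamps = [i * CONN_INTERVAL_US for i in range(10)]
resume_at = timestamps[-1] + 500_000
timestamps += [resume_at + i * CONN_INTERVAL_US for i in range(5)]

# Flag any silence longer than three connection intervals (arbitrary threshold).
silences = find_silences(timestamps, max_gap_us=3 * CONN_INTERVAL_US)
for start, end, duration in silences:
    print(f"silence from {start} us to {end} us ({duration / 1000:.0f} ms)")
```

With real data, the timestamps would come from an exported capture, and a silence that outlasts the connection's supervision timeout would line up with the timeout disconnect reported on both sides.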

Around the same time, we found that the issue did not present itself for sensors embedded inside the knob of the bat, only for sensors connected to the knob with our rubber high-performance mounts. This was starting to boggle our minds.

A Diamond Kinetics bat sensor mounted via a high-performance mount
A Diamond Kinetics sensor embedded in the knob of a Marucci Cat X Smart Bat

With no logic triggered by swings or motion and the issue occurring only in our high-performance mounts, a physical phenomenon seemed like the probable cause. So, in addition to disabling the code surrounding the MEMS devices, I removed all MEMS devices from a board and still reproduced the issue. I also replaced the SMT BLE antenna with an alternate, to no avail. I was obviously grasping at straws.

I then decided to test with some example code from the main MCU’s manufacturer. Comparing the configuration of the example against our code, I saw that the example used the internal resistor-capacitor (RC) low-power clock source, whereas we used an external 32 kHz crystal oscillator. I changed our application’s configuration to use the internal RC clock source and could no longer reproduce the issue.

What weirdness! How could the clock be the problem? I looked online and found support for this possibility in multiple articles and studies about the effect of shock and vibration on crystal oscillators. I also verified that while the BLE subprocessor wasn’t driven by the low-power clock, some of the interval timing functions were in fact using that clock source. We found our smoking gun!
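Some back-of-the-envelope arithmetic shows why a misbehaving low-power clock can break a BLE connection. Between connection events, a BLE peripheral times its next wake-up from its sleep clock and widens its receive window by the combined sleep clock accuracy (SCA) of both devices multiplied by the time since the last anchor point. The Python sketch below applies that rule. The SCA values and the 1% clock error are purely illustrative assumptions, since the oscillator's behavior under shock was never measured here, and any fixed jitter allowance is left out for simplicity.

```python
def window_widening_us(central_sca_ppm: int, peripheral_sca_ppm: int,
                       since_anchor_us: int) -> float:
    """Receive-window widening: combined sleep clock accuracy (ppm)
    times the time elapsed since the last anchor point."""
    return (central_sca_ppm + peripheral_sca_ppm) * since_anchor_us / 1_000_000


def timing_error_us(clock_error_ppm: int, since_anchor_us: int) -> float:
    """Wake-up timing error accumulated by a clock that is off by clock_error_ppm."""
    return abs(clock_error_ppm) * since_anchor_us / 1_000_000


def misses_event(clock_error_ppm: int, central_sca_ppm: int,
                 peripheral_sca_ppm: int, since_anchor_us: int) -> bool:
    """True when the accumulated timing error exceeds the widened window."""
    error = timing_error_us(clock_error_ppm, since_anchor_us)
    return error > window_widening_us(central_sca_ppm, peripheral_sca_ppm, since_anchor_us)


INTERVAL_US = 15_000   # 15 ms connection interval
CENTRAL_SCA_PPM = 50   # assumed value for the phone
PERIPH_SCA_PPM = 20    # assumed value for the sensor's crystal

widening = window_widening_us(CENTRAL_SCA_PPM, PERIPH_SCA_PPM, INTERVAL_US)
healthy = misses_event(20, CENTRAL_SCA_PPM, PERIPH_SCA_PPM, INTERVAL_US)
shocked = misses_event(10_000, CENTRAL_SCA_PPM, PERIPH_SCA_PPM, INTERVAL_US)  # hypothetical 1% error

print(f"window widening per interval: {widening:.2f} us")
print(f"healthy crystal misses event: {healthy}")
print(f"shocked crystal misses event: {shocked}")
```

Under these assumptions the tolerance per interval is on the order of a microsecond, so a clock that briefly runs far outside its rated accuracy could plausibly miss connection events until the link times out.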

Fortunately, our main processor regularly calibrates the clock when the internal resistor-capacitor (RC) oscillator is the clock source, which gives us confidence in our time tracking over voltage and temperature changes. We were also able to switch the low-power clock source before going to sleep, since calibration of the RC source does not occur in deep sleep. We may eventually replace the external 32 kHz oscillator with a clock source that is not susceptible to shock and vibration, but for now we have a working solution for all of our devices, including those already in the field.

I want to go back and capture the output of the external oscillator during this phenomenon but have not yet done so. However, I did take some oscilloscope captures of voltage and current across the device during my debugging efforts, which provide indirect evidence of the timing failure. The image below shows that the 15 ms connection interval becomes inconsistent and the device returns to advertising.

Thanks to Jesse’s diligence and efforts, the Diamond Kinetics app and Bat Sensor have been shipping successfully since April 2023. Our sensor connectivity issues, which had been hard to pin down and seemingly very transient, have all but disappeared. This allowed Diamond Kinetics to partner with Marucci to launch the first true Smart Bat in late 2023.

Diamond Kinetics Dev Team

The engineering team at Diamond Kinetics, the Trusted Youth Training Platform of Major League Baseball