The design criteria involved scanning the workspace, calculating and executing a path, and controlling a marker along that path to draw an image. Our finished solution met these requirements, implementing a Robo Artist that applies sensing, planning, and actuation. The system senses images by scanning them, then plans by processing the image into contours and producing path waypoints for the Sawyer arm to trace. Finally, the robot arm is actuated to follow the calculated path with a marker attached, producing a drawn copy of the original image. Our spring-based pen holder also effectively kept the marker tip from being damaged while drawing, although the Sawyer occasionally compressed the holder beyond its travel distance.
Future Features
Beyond fixing the current limitations of our robot, there are several possible extensions.
We could attempt orienting a chisel-tip marker so that it always draws a wide line, by keeping the wide edge of the tip perpendicular to the local contour direction. This would require a new marker holder, but it would give the Sawyer more control over its output. A paintbrush could also be considered, since its strokes would likewise change if the brush rotated about its axis while painting.
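As a rough sketch of the idea (not code we wrote), the wrist yaw at each waypoint could be derived from the local contour direction; the waypoint format and the zero-yaw reference below are assumptions:

```python
import numpy as np

def wrist_yaw_for_wide_stroke(waypoints):
    """For each waypoint, return a wrist yaw (radians) that keeps a
    chisel tip's wide edge perpendicular to the local contour direction.

    `waypoints` is an (N, 2) array of x/y points in the paper plane;
    the marker geometry and zero-yaw reference are assumptions here.
    """
    pts = np.asarray(waypoints, dtype=float)
    # Estimate the local contour direction with finite differences.
    tangents = np.gradient(pts, axis=0)
    tangent_angles = np.arctan2(tangents[:, 1], tangents[:, 0])
    # Rotating the tip 90 degrees from the tangent puts the wide edge
    # across the stroke, producing the widest possible line.
    return tangent_angles + np.pi / 2

# Example: a straight diagonal contour gets a constant yaw.
yaws = wrist_yaw_for_wide_stroke([[0, 0], [1, 1], [2, 2]])
```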
Color drawing could also be implemented to improve the Sawyer's drawing ability. This could be enabled by designing a rotating barrel that our pen holders slot into, allowing the Robo Artist to hold multiple markers, or by keeping several attachments available via a quick-swap mechanism or a design better suited to the gripper interface. The rotating-barrel attachment would also need an integrated motor to switch colors. Ideally, we would have the resources to prototype this pen holder further, increasing its rigidity and reducing play within the marker tube as much as possible.
Camera Calamities
Throughout this project, we encountered plenty of difficulties due to the hardware limitations of the Sawyer system. The wrist camera's field of view was too small: it was difficult to capture everything in a single image, forcing us to split the process into capturing the image to copy and then drawing out that copy. Furthermore, the wrist camera's focal length was too short, which caused blurriness; combined with the camera's relatively low resolution, this produced inconsistencies and artifacts in the image-processing phase.
Despite these shortcomings, the wrist camera was still better than the head camera for isolating the source image, since it limited extra details such as background imagery that would complicate the paper masking. However, the wrist camera brought its own complications. It tended to blow light objects out to pure white, making it difficult to tell where our source paper started and ended, especially against the bare table. This required us to get a black tablecloth so there was enough contrast to sense the borders of the source picture. We also had to adjust the exposure occasionally when photographing the source image, depending on the room lighting and which robot we were using, to avoid blown-out images; this, too, was finicky at times.
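We don't reproduce our exact masking code here, but a minimal sketch of the contrast-based approach, assuming OpenCV and an illustrative threshold value, looks roughly like this:

```python
import cv2
import numpy as np

def mask_paper(bgr_image, thresh=180):
    """Isolate the white source paper against the dark tablecloth.

    The fixed threshold (180) is illustrative; in practice the exposure
    and threshold had to be tuned per room lighting and per robot.
    """
    gray = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(gray, thresh, 255, cv2.THRESH_BINARY)
    # Remove speckle so the paper shows up as one large blob.
    kernel = np.ones((5, 5), np.uint8)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    # Keep only the largest contour, assumed to be the paper.
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    paper = max(contours, key=cv2.contourArea)
    return mask, cv2.boundingRect(paper)
```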
Although the head camera had a much better field of view and higher resolution, the angle at which it viewed the table made glare off the AR tags a major issue (exacerbated by our custom tags having been printed with glossy ink), which we remedied by shading the tags. Even then, AR tag detection was very unreliable, and getting the robot to detect the tags' positions easily consumed half or more of our time in lab while operating the robots. The unreliable detection also caused noticeable variation in the detected table height. Our marker holder could compensate, but sometimes even that was not enough, requiring us to change the wrist's offset from the table, even though that hardcoded offset relative to the average AR tag z-height should have been consistent from run to run, since the pen holder never changed.
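Conceptually, the pen height was derived as a hardcoded offset above the averaged tag heights, something like the following (the offset value is a placeholder, not our actual constant):

```python
# Placeholder constant: height of the pen's touch-down point above the
# sensed table surface, set by the pen holder geometry.
PEN_HOLDER_OFFSET_Z = 0.02  # meters

def drawing_height(tag_z_heights):
    """Table z is taken as the mean of the detected AR tag z-heights."""
    table_z = sum(tag_z_heights) / len(tag_z_heights)
    return table_z + PEN_HOLDER_OFFSET_Z
```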
Another consequence of the AR tracking quality may have been inconsistency in the image transform. Since our robot begins executing immediately after both AR tags used to locate the paper are detected, any variation in the sensed tag positions could shift where the output image is drawn. We have reason to believe this happened: the AR tags' detected locations sometimes jumped around in RViz, and the robot drew the image in noticeably different places when using the exact same contours and the exact same output-paper placement, just executed at different times and therefore possibly with slightly different detected AR marker locations.
These camera limitations could have been mitigated by using another camera, such as the RealSense cameras available in lab. However, since there were many projects and a limited number of cameras, we were recommended to use the Sawyer's integrated cameras instead of a RealSense. With more time and resources, we would have used one and built a transform from it to the robot's base frame. We could also have tested more AR tag patterns beyond 6 and 0 to determine which patterns the Sawyer could sense most reliably, as the pattern itself may affect tracking reliability as well.
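Registering an external RealSense would mostly amount to publishing a static transform from the Sawyer's base frame to the camera frame. A hypothetical ROS sketch, with placeholder frame names and measured offsets, might look like this:

```python
import rospy
import tf2_ros
import tf_conversions
from geometry_msgs.msg import TransformStamped

# Hypothetical static transform from the Sawyer base to an external
# RealSense; the frame names and mounting offsets are placeholders that
# would come from measuring the camera mount.
rospy.init_node("realsense_static_tf")
broadcaster = tf2_ros.StaticTransformBroadcaster()

t = TransformStamped()
t.header.stamp = rospy.Time.now()
t.header.frame_id = "base"           # Sawyer base frame
t.child_frame_id = "camera_link"     # RealSense frame
t.transform.translation.x = 0.50     # measured mounting offsets (m)
t.transform.translation.y = 0.00
t.transform.translation.z = 0.80
q = tf_conversions.transformations.quaternion_from_euler(0.0, 1.57, 0.0)
t.transform.rotation.x, t.transform.rotation.y = q[0], q[1]
t.transform.rotation.z, t.transform.rotation.w = q[2], q[3]

broadcaster.sendTransform(t)
rospy.spin()
```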
Stopping a run because the AR tags were suddenly lost
Varied output due to AR tracking issues
Transform Tribulations
Throughout the project, one of our key goals was to ensure that the robot could successfully draw an image on an output paper placed and oriented arbitrarily (within the Sawyer arm's reach, of course). We went through several solutions. The first was a script that did not work at all. The second was ChatGPT-assisted; it successfully transformed the image contours to the correct location on the output paper when the paper was placed in landscape orientation, but even after many revisions it did not handle rotated papers properly, which baffled us since the code seemed like it should rotate the input contours by the angle defined by the AR tags' real locations.
Finally, we independently developed an algorithm using rotation matrices and linear translations. We made all the reasonable assumptions we could, using the known paper and AR tag dimensions to build constants for shifting and scaling the image properly. We also made sure we understood the Sawyer base's coordinate system, the reference frame in which we were locating our AR tags, confirming that its forward direction (facing the computers) was indeed X and its left side was indeed Y. We built a Jupyter notebook to model the algorithm on sample AR tag locations and verify that it transformed sample points reasonably. This resulted in perfect rotations, with our drawing always in the same orientation relative to the paper no matter how the paper was rotated, but it still had translation issues that left images off-center on the paper. We deduced that the translational offset from the sensed AR tag locations was to blame, but could not understand why.
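The core of what we modeled in the notebook can be sketched as follows; the paper and tag-offset constants below are stand-ins for our measured values, and the origin convention is simplified:

```python
import numpy as np

# Stand-in constants; our real values came from the measured paper and
# AR tag dimensions.
PAPER_W, PAPER_H = 0.2794, 0.2159   # letter paper in meters (landscape)
TAG_OFFSET = 0.05                   # tag center to paper corner (m)

def paper_to_base(points_paper, tag_a, tag_b):
    """Map (N, 2) contour points expressed in paper coordinates into the
    Sawyer base frame, given the two detected AR tag positions (x, y)
    that flank the output paper.
    """
    tag_a = np.asarray(tag_a, dtype=float)
    tag_b = np.asarray(tag_b, dtype=float)
    # Paper orientation from the vector between the two tags.
    d = tag_b - tag_a
    theta = np.arctan2(d[1], d[0])
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    # Shift from tag A to the paper origin, then rotate into the base frame.
    origin = tag_a + R @ np.array([TAG_OFFSET, TAG_OFFSET])
    pts = np.asarray(points_paper, dtype=float)
    return (R @ pts.T).T + origin
```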
Ultimately, we used this last algorithm with some image-shifting constants deduced by trial and error, which ended up recreating the image well, within about a centimeter of translational and scaling error even after rotating the output paper to an arbitrary angle. While the constants could use more fine-tuning, the transform works for the most part, and much of the remaining error can be attributed to sensor inaccuracy in the output-paper locating method described above.
If we had more time, we would attempt to further understand why the transform code behaves this way and more thoroughly verify the assumptions behind it. From there, we would make the transform more mathematically robust instead of fudging constants.
An example contour shown aligned with the paper's orientation (four small dots at the corners) in the transform simulator we built in a notebook
Notice that the Miku is ever so slightly off to the left
Hardware Hardships
In addition to the cameras, the robot arm itself may limit precision due to deflection that cannot be accounted for in the controls, since it comes from the structure or mechanisms flexing or having play. We observed that even small forces applied to the arm can move it while the actuators are holding it in place, and can deflect the arm's path while it is moving. We attempted to mitigate this by, again, selecting very soft springs for the marker holder to minimize external forces on the arm, and our trajectories and placements ultimately seemed reasonable.
Our pen holder also had peculiar durability issues. Our very first print was very flexible and easily handled several assembly and disassembly cycles until it was crashed into the table at an oblique angle and fractured. Every subsequent attempt at producing the holder resulted in cracked mounting slots. We used PLA each time, so it was unclear why we could not match the quality of the initial holder; we never figured this out before the project ended. The cause could come down to printing parameters (e.g., flow rate) or even the type or manufacturer of the PLA filament.
Ultimately, after we refined our control algorithm, we found that all of our holders were broken. To demonstrate the new controls, we simply shoved the marker tube into the base in the opposite orientation, which worked thanks to a tight fit but was a much less elegant solution.
If we had more time, we would investigate why our print quality was so inconsistent and, based on the findings, print holders optimized for flexibility and toughness. We could also have tried an even greater range of springs, or no spring at all, to see the effect on drawing quality, since we noticed that pen-on-paper friction forces could have contributed to jagged contours.
Small applied forces on the Sawyer arm can still cause noticeable deflection
Broken slot tabs on a marker holder
For our final videos, we ran the broken marker holder like this
Software Sorrows
Our prior software stack planned and executed a separate motion for each point in a contour (plan, execute, plan, execute, and so on), making the drawing process slow and piecewise.
Eventually, we figured out we could use MoveIt's compute_cartesian_path to generate a smooth movement instead of moving to each point and stopping. This solution was implemented and demonstrated on several example drawings we produced after Demo day, which are shown on this site.
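A simplified sketch of this approach using moveit_commander is below; the group name, interpolation step, and completeness threshold are assumptions rather than our exact values:

```python
import sys
from copy import deepcopy

import rospy
import moveit_commander

moveit_commander.roscpp_initialize(sys.argv)
rospy.init_node("draw_contours")
group = moveit_commander.MoveGroupCommander("right_arm")  # assumed Sawyer group name

def contour_waypoints(points_base, draw_z):
    """Build Cartesian waypoints for one contour. `points_base` are (x, y)
    points already transformed into the base frame; draw_z is the pen height."""
    start = group.get_current_pose().pose
    waypoints = []
    for x, y in points_base:
        pose = deepcopy(start)
        pose.position.x, pose.position.y, pose.position.z = x, y, draw_z
        waypoints.append(pose)
    return waypoints

def draw_contour(points_base, draw_z):
    """Trace one contour as a single smooth Cartesian path instead of
    planning and stopping at every point."""
    waypoints = contour_waypoints(points_base, draw_z)
    # 5 mm interpolation step; jump threshold disabled.
    plan, fraction = group.compute_cartesian_path(waypoints, 0.005, 0.0)
    if fraction > 0.95:  # only execute near-complete plans
        group.execute(plan, wait=True)
```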
Some of the jaggedness in the Sawyer's actuation did not go away even with this new method, particularly on smaller-radius contours. This could be because the movements are so small that we are seeing the limits of the Sawyer's actuation resolution, because friction between the marker and the paper causes the marker to slip, or because deflection in the arm causes oscillations as forces are exerted when the arm changes direction.
If we had more time, we could have tried more robot speeds between our final speed of 20% and 100% to see if we could maintain good accuracy while maximizing speed and/or smoothness. At higher speeds, the inertia of the arm could help 'fill in' the roughness caused by fine actuation steps.
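Such a speed sweep would be a small change on top of the Cartesian-path sketch above; hypothetically, reusing its `group` and `contour_waypoints`:

```python
# Hypothetical sweep, reusing `group` and `contour_waypoints` from the
# Cartesian-path sketch above. Our final setting was a 20% speed scale.
robot = moveit_commander.RobotCommander()
waypoints = contour_waypoints(test_points, draw_z=0.0)  # placeholder test contour
plan, fraction = group.compute_cartesian_path(waypoints, 0.005, 0.0)

for scale in (0.2, 0.4, 0.6, 0.8, 1.0):
    # Retime the same plan at each speed scale, then redraw and compare
    # the output for smoothness and accuracy.
    retimed = group.retime_trajectory(robot.get_current_state(), plan,
                                      velocity_scaling_factor=scale)
    group.execute(retimed, wait=True)
```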
The output image is jagged