Camera coverage was improved by moving from 6 to 10 cameras, then bringing the cameras closer together into a tighter circle. This was key, as it increased the overlap in camera coverage across the face rig so that multiple cameras could see every marker at all times during the capture.
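As a rough way to picture the coverage goal, here is a minimal sketch (my own, not anything from the OptiTrack tools) that checks whether every marker is seen by at least two cameras on every frame; the data shape and names are hypothetical.

```python
# Illustrative-only coverage check; visibility data and names are made up.
import numpy as np

def min_cameras_per_marker(visibility: np.ndarray) -> int:
    """visibility[frame, camera, marker] is True if that camera sees that marker."""
    cameras_seeing = visibility.sum(axis=1)  # frames x markers
    return int(cameras_seeing.min())

# Fake 2-frame, 10-camera, 3-marker visibility table
vis = np.ones((2, 10, 3), dtype=bool)
vis[0, :8, 1] = False                        # marker 1 seen by only 2 cameras on frame 0
assert min_cameras_per_marker(vis) >= 2      # overlap requirement still met
```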
Camera settings were optimized for close-up facial capture (lower intensity and exposure, and a higher threshold, to reduce unwanted IR reflections off of the face). I can't remember the exact settings, but it was somewhere in the range of Continuous IR, 3 Intensity, 15 Exposure, 230 Threshold. In large-volume applications, you want to adjust Exposure first, then Threshold, then IR Intensity. But in close-up face volumes, adjusting Intensity first is recommended, because the default illumination of the V100:R2 camera is more than is needed for face capture.
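To keep those numbers and the tuning order straight, here is a small sketch of how I think about it. The dataclass, field names, and ranges are my own shorthand for the approximate values above, not an OptiTrack API.

```python
# Illustrative settings summary only; not OptiTrack's API or exact values.
from dataclasses import dataclass

@dataclass
class CameraSettings:
    illumination: str   # continuous vs. strobed IR
    intensity: int      # IR LED output (low values dim the LEDs)
    exposure: int       # shutter length, in camera-internal units
    threshold: int      # per-pixel brightness cutoff on an 8-bit scale

# Rough close-up face settings described above for the V100:R2
face_closeup = CameraSettings(
    illumination="Continuous IR",
    intensity=3,    # turned down first: default IR output exceeds what a face volume needs
    exposure=15,    # short exposure suppresses dim, unwanted reflections
    threshold=230,  # high cutoff so only bright marker centroids survive
)

# Suggested tuning order, as described in the text
LARGE_VOLUME_ORDER = ("exposure", "threshold", "intensity")
FACE_VOLUME_ORDER = ("intensity", "exposure", "threshold")
```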
During wanding, we made sure that all cameras had sufficient samples, that the samples covered as much of the 2D camera view as possible, and that the complete capture volume was wanded. We ran the calibration on Medium until all results were exceptional, switched to High until all results were exceptional, then switched to Very High and continued until we had all exceptionals. This calibration yielded residuals on markers well under 1 mm. (This technique of stepping the calibration quality up from Medium-->High-->Very High can help it converge if it struggles when starting with Very High. It can be done in real-time, during the calculation phase.)
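For readers unfamiliar with what the residual figure means, here is a loose stand-in: treat each triangulated estimate of a wand marker as a sample and measure the spread around the mean position. This is only an illustration of the sub-millimeter target, not the actual math Arena uses; the helper name and numbers are made up.

```python
# Loose illustration of the "residual" sanity check, not Arena's actual computation.
import numpy as np

def marker_residual_mm(estimates_mm: np.ndarray) -> float:
    """RMS distance (mm) of per-sample 3D estimates from their mean position."""
    mean_pos = estimates_mm.mean(axis=0)
    return float(np.sqrt(((estimates_mm - mean_pos) ** 2).sum(axis=1).mean()))

# Example: estimates of one wand marker scattered by a few tenths of a millimeter
samples = np.array([[100.0, 50.0, 200.0],
                    [100.2, 50.1, 199.8],
                    [ 99.9, 49.8, 200.1]])
assert marker_residual_mm(samples) < 1.0  # target: well under 1 mm
```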
When doing the facial motion capture, we had to experiment a bit with marker placement and template construction. Additionally, we had to manually set the template for playback and trajectorization to prevent breakage. You can see a quick video showing how to do this here.
If you go to the first frame, reset the template, move it into position, then advance to the second frame, it should solve. You can then backtrack to the first frame and trajectorize without breakage.
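The intuition behind why a well-seated template on the first frame lets the rest solve is that labels propagate from frame to frame by proximity. The sketch below is my own simplification of that idea, not OptiTrack's labeling or trajectorization algorithm; all names and thresholds are hypothetical.

```python
# Rough simplification of label propagation; not OptiTrack's actual algorithm.
import numpy as np

def propagate_labels(prev_labeled: dict, frame_markers: np.ndarray,
                     max_jump_mm: float = 10.0) -> dict:
    """Assign each labeled point from the previous frame to the closest
    reconstructed marker in the current frame, within a distance gate."""
    labeled = {}
    used = set()
    for name, prev_pos in prev_labeled.items():
        dists = np.linalg.norm(frame_markers - prev_pos, axis=1)
        for idx in np.argsort(dists):
            if idx not in used and dists[idx] <= max_jump_mm:
                labeled[name] = frame_markers[idx]
                used.add(idx)
                break
    return labeled  # a label with no nearby marker drops out (a "break" in the trajectory)

# Frame 1: template manually seated on the face markers
template_frame1 = {"brow_L": np.array([30.0, 120.0, 55.0]),
                   "brow_R": np.array([-30.0, 120.0, 55.0])}
# Frame 2: markers have moved slightly; the labels follow them
frame2 = np.array([[31.0, 121.0, 55.5], [-29.5, 119.0, 54.8]])
print(propagate_labels(template_frame1, frame2))
```

If the template is badly seated on the first frame, the nearest-marker gate fails immediately, which is the breakage the manual reset-and-position step avoids.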