We're an FDA-regulated medical device startup with a pretty low budget for the moment. Our current setup is two-pronged, in-house, and automated.
The first piece is the specification documents, which are simple Word docs with a predictable format. These cover how the software SHOULD be implemented. From these documents we automatically generate the mission-critical code, which ensures it matches what we say it does in the document. The generator is very picky about the format, so you know right away if you've made a mistake in the spec document. These documents are checked into a repo, so we can tag version releases and get (mostly) reproducible builds.
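To make the "picky generator" idea concrete, here's a minimal sketch of the pattern. The requirement format and the generated code are entirely invented for illustration; the point is only that every spec line must match a strict shape, and any deviation aborts immediately so mistakes surface at generation time rather than in shipped code.

```python
import re

# Hypothetical strict spec format: each requirement line must match
# this pattern exactly, or generation fails loudly.
SPEC_LINE = re.compile(r"^REQ-(\d+): set (\w+) to (\d+) when (\w+) is (\d+)$")

def generate_code(spec_lines):
    """Turn spec lines into (trivial) generated code, rejecting any format error."""
    out = []
    for lineno, line in enumerate(spec_lines, 1):
        m = SPEC_LINE.match(line.strip())
        if not m:
            raise ValueError(f"spec format error on line {lineno}: {line!r}")
        req_id, target, value, guard, guard_val = m.groups()
        # Emit code that is traceable back to its requirement ID.
        out.append(f"# REQ-{req_id}")
        out.append(f"if {guard} == {guard_val}:")
        out.append(f"    {target} = {value}")
    return "\n".join(out)

print(generate_code(["REQ-001: set alarm to 1 when pressure is 200"]))
```

Because the generator is the only path from spec to code, a malformed spec line can never silently produce wrong behavior; it simply refuses to build.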
The second piece is the verification test spreadsheet. We start by stating all the assumptions we make about how the code should work, and the invariants that must hold. These are then translated into high-level requirements. Requirements are checked using functional tests, each of which consists of one or more verification tests.
Each functional test defines a sequence of verification tests. Each verification test is a single row in the spreadsheet containing all the inputs for the test and the expected outputs. The spreadsheet is then parsed into what essentially amounts to serialized objects, which the actual test code uses to perform and check each test. Functional test code is handwritten, but is expected to handle many tests with different parameters from the spreadsheet. In this way we write N test harnesses but get ~N*M total tests, where M is the average number of verification tests per functional test.
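A rough sketch of the row-per-verification-test idea, assuming a CSV export of the sheet. The column names, the dataclass, and the example harness are all made up; what matters is that one handwritten harness (N=1 here) is driven by every matching row (M rows), giving N*M tests:

```python
import csv
import io
from dataclasses import dataclass

@dataclass
class VerificationTest:
    func_test: str   # which handwritten functional test handles this row
    inputs: dict     # all inputs for the test
    expected: dict   # expected outputs

def load_tests(csv_text):
    """Parse spreadsheet rows into test objects (in_* columns are inputs,
    exp_* columns are expected outputs -- an invented convention)."""
    tests = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        inputs = {k[3:]: float(v) for k, v in row.items() if k.startswith("in_")}
        expected = {k[4:]: float(v) for k, v in row.items() if k.startswith("exp_")}
        tests.append(VerificationTest(row["func_test"], inputs, expected))
    return tests

# One handwritten harness serves every row that names it.
def harness_scale(inputs):
    return {"out": inputs["x"] * inputs["gain"]}

HARNESSES = {"scale": harness_scale}

SHEET = """func_test,in_x,in_gain,exp_out
scale,2,3,6
scale,5,10,50
"""

for t in load_tests(SHEET):
    actual = HARNESSES[t.func_test](t.inputs)
    assert actual == t.expected, (t, actual)
```

Adding a new verification test is then just adding a row to the sheet, with no new code.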
All test outputs are logged, including result, inputs, expected outputs, actual outputs, etc. These form just a part of future submission packages, along with traceability reports we can also generate from the spreadsheet.
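The logged record per verification test might look something like the following. The field names here are assumptions, but the principle is that each run captures enough (result, inputs, expected vs. actual, plus the requirement it traces to) that traceability reports can be rebuilt from the logs alone:

```python
import json

def log_result(requirement, func_test, row_id, inputs, expected, actual):
    """Emit one JSON record per verification test run; the PASS/FAIL
    verdict is derived from comparing expected and actual outputs."""
    record = {
        "requirement": requirement,
        "functional_test": func_test,
        "verification_test": row_id,
        "inputs": inputs,
        "expected": expected,
        "actual": actual,
        "result": "PASS" if expected == actual else "FAIL",
    }
    return json.dumps(record)

print(log_result("REQ-001", "scale", "VT-17",
                 {"x": 2, "gain": 3}, {"out": 6}, {"out": 6}))
```

Grouping these records by the `requirement` field is then all a basic traceability report needs.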
All of this is handled with just one Google Sheets spreadsheet and a few hundred lines of Python, and it saves us oodles while catching tons of bugs. We've gotten to the point where any change in the spec documents immediately triggers test failures, so we know that what we ship is what we actually designed. Additionally, the reports generated by the tests make great V&V documentation for regulatory submissions.
In the future, the plan is to move from Word docs + spreadsheets to a more complete QMS (JAMA + Jira come to mind), but at the stage we're at, this setup works very well for not much cost.
Thanks for sharing. I just realized your use case could fit quite neatly into Inflex (my app, not open for signups, but has a “try” sandbox), a use case I never considered.
I have an unexposed WIP cell type that is a rich text document. The rich document you edit has corresponding code, which you can also edit in the other direction (screenshot: https://mobile.twitter.com/InflexHQ/status/14923564133263360...). The neat part is that I added the ability to embed a “portal” that displays another cell, such as a number or a table, which can be edited inside the rich text doc or at the cell’s origin. For tutorials, I figured it would also be nice to display the source code of a given cell in a rich editor. E.g.
The formula [plasma * 3.23224 + alpha] yields [7.289221].
You could edit the text (The Formula ... yields ...), or the formula and see the result.
Finally, the rich editor is already in a format ready to be printed as a PDF or Word doc.
Also, the source code being the source of truth means it’s very easy to version the whole system down to SHA512 hashes.
This could unify a use case like yours, where you have the Google Doc and the Google Sheet bridged by Python, and it would cut down that iteration feedback loop.
Being content addressable means that only tests that need to run would be run (see Unison lang), rather than running all tests every time, further cutting down on the feedback loop.
One question that comes to mind: what if you could export the code in the spreadsheet to a general-purpose language like C or Python? Or even Lua? Also, would an on-prem/desktop version of the product be valuable for this use case?
Thanks for the food for thought. This is a niche I never considered! And I’ve worked on a medical device for Amgen before! I forgot all about this.
The data in the spreadsheet is really the most valuable part! I like the idea of using a small DSL to manage this, we actually already have some (very simple) DSL-like things in the spreadsheet to make it easier on editors of the spreadsheet.
Of course, with systems like JAMA, which automatically ingest Word docs and extract requirements (even to the point of generating verification tests, which in turn kick off dependency updates), the development loop becomes quite tight.
We definitely have to improve on the process that exists right now, as it's quite cumbersome to get all set up, and won't really generalize to new devices, but it's been a great learning experience, and I think it's really improved our development process overall!
Definitely a very complex problem to automate in the general case; it makes sense why the JAMAs and Greenlight Gurus of the world can charge ungodly sums per seat.