Having worked in the same problem space, I strongly recommend getting expert input when evaluating which features to use. Ideally, this is someone who knows the internals of the machinery and/or operations and can help you remove spurious features. As a data scientist, it is tempting to think that the data explains everything and that no domain expertise is needed ("Modern machine translation works without any knowledge of grammar or language!"). Good luck!