Don't peek at results. Checking results daily creates bias. You see the variant is up 12% after 3 days, get excited, and want to declare a winner. But day-to-day variation is normal. What looks like a 12% win after 3 days might be a 2% loss after 14 days. Set a calendar reminder for when the test reaches its required sample size, then check results once.
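A minimal sketch of how that required sample size might be computed up front, assuming a two-proportion z-test with equal traffic per arm. The function name, parameters, and the 5% baseline / 10% lift example are illustrative, not from the article.

```python
# Sketch: size the test before launch, assuming a two-proportion z-test
# with a 50/50 traffic split (normal approximation).
from math import ceil
from scipy.stats import norm

def required_sample_size(baseline_rate: float,
                         min_detectable_lift: float,
                         alpha: float = 0.05,
                         power: float = 0.80) -> int:
    """Visitors needed per variant to detect a relative lift at the chosen
    significance level and power."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + min_detectable_lift)
    z_alpha = norm.ppf(1 - alpha / 2)   # two-sided test
    z_beta = norm.ppf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2
    return ceil(n)

# Example (illustrative numbers): 5% baseline conversion, detect a 10% relative lift
# required_sample_size(0.05, 0.10) -> roughly 31,000 visitors per variant
```

That number, plus your expected daily traffic, is what goes on the calendar reminder.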
If you absolutely must check (for example, to catch implementation errors), check at most weekly and set a strict "no decisions until completion" rule. Looking is fine; acting on what you see is not.
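One way to keep a weekly check honest is to inspect only data quality, never the conversion results. The sketch below checks for a broken traffic split (sample ratio mismatch); the function name and 0.001 threshold are assumptions of mine, not from the article.

```python
# Sketch: a "look but don't decide" weekly health check. It validates the
# traffic split without reporting which variant is winning.
from scipy.stats import chisquare

def weekly_health_check(control_visitors: int, variant_visitors: int) -> None:
    total = control_visitors + variant_visitors
    # With a 50/50 split, a very small p-value here suggests broken
    # randomization or tracking, which does justify stopping the test.
    _, p_value = chisquare([control_visitors, variant_visitors],
                           f_exp=[total / 2, total / 2])
    if p_value < 0.001:
        print("WARNING: traffic split looks broken; investigate the implementation.")
    else:
        print("Traffic split looks healthy; no decisions until the test completes.")
```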
Don't stop tests early based on results. "Variant is losing after 5 days, let's stop the test." Unless you set pre-defined stopping rules (for example, variant worse by more than 20% at 50% of the planned sample size), run the full test. Early results are noise. The only exception: technical problems (page broken, tracking failed) warrant stopping.
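Here is a sketch of what such a pre-defined stopping rule could look like in code, assuming the thresholds were written down before launch. The function and parameter names are illustrative.

```python
# Sketch: a stopping rule decided before the test started. The only inputs
# that can end a test early are technical failures or the pre-registered
# futility threshold.
def should_stop_early(relative_lift: float,
                      samples_collected: int,
                      required_samples: int,
                      page_broken: bool = False,
                      tracking_failed: bool = False) -> bool:
    # Technical problems always justify stopping.
    if page_broken or tracking_failed:
        return True
    # Pre-registered futility rule: variant worse than -20% at >= 50% of sample.
    if samples_collected >= required_samples * 0.5 and relative_lift < -0.20:
        return True
    # Otherwise, run to the planned sample size.
    return False
```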
Don't change tests mid-run. "Variant isn't winning, let's tweak the headline slightly." Now you've invalidated the test. Results are meaningless because you don't know which version drove outcomes. If you think of an improvement mid-test, document it as the next test, don't change the current one.
Don't add post-hoc success metrics. "Conversion didn't improve but scroll depth did, so it's a success." No. You defined the primary metric before testing. Stick to it. You can note secondary metrics for learning, but don't retroactively redefine success.
Don't ignore statistical significance. "The variant is up 8% but only at 90% confidence, close enough." Either set your threshold at 90% before testing, or wait for 95%. Don't change standards based on what's convenient.
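A sketch of evaluating the primary metric against the threshold you fixed before launch, assuming a two-proportion z-test via statsmodels. The constant and function names are mine; the point is that ALPHA never changes after you see the numbers.

```python
# Sketch: the significance threshold is set once, before the test starts,
# and the evaluation only reports against that threshold.
from statsmodels.stats.proportion import proportions_ztest

ALPHA = 0.05  # use 0.10 only if you chose it before launch

def evaluate_test(control_conversions: int, control_visitors: int,
                  variant_conversions: int, variant_visitors: int) -> str:
    _, p_value = proportions_ztest(
        count=[variant_conversions, control_conversions],
        nobs=[variant_visitors, control_visitors],
    )
    if p_value < ALPHA:
        return f"Significant at the pre-set threshold (p={p_value:.3f})."
    return f"Not significant (p={p_value:.3f}); don't lower the bar after the fact."
```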
The discipline: write down your test plan (hypothesis, success criteria, sample size, duration) before starting. When you want to stop early or change something, re-read your test plan. Follow it.
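One way to make the plan hard to fudge is to write it down as data, not just prose. A hedged sketch below, assuming Python; the fields mirror the checklist above, and all names and example values are illustrative.

```python
# Sketch: the test plan as an immutable record written before launch.
# frozen=True makes mid-test edits deliberate rather than casual.
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class TestPlan:
    hypothesis: str
    primary_metric: str
    minimum_detectable_lift: float    # relative, e.g. 0.10 for +10%
    significance_threshold: float     # decided before launch
    required_sample_size_per_variant: int
    planned_end_date: date
    stopping_rules: str

plan = TestPlan(
    hypothesis="Shorter signup form increases completed signups",
    primary_metric="signup_conversion_rate",
    minimum_detectable_lift=0.10,
    significance_threshold=0.05,
    required_sample_size_per_variant=31_000,
    planned_end_date=date(2025, 7, 1),
    stopping_rules="Stop only for technical failures, or variant below -20% at 50% of sample.",
)
```

When the urge to stop early or tweak the variant hits, this is the document you re-read.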