How We Protect Your Data Here at Pipedrive

Last week, a subset of our customers experienced an unfortunate incident that removed some email contents they had saved to Pipedrive using the Smart Email BCC feature. The people affected rightfully raised concerns about our ability to protect customers from any kind of data loss.

In this post, I would like to start rebuilding the level of customer trust we have enjoyed until now. Yes, actions speak louder than words, but first allow me to detail our architecture and practices around customer data storage and protection, shed some light on the incident, and then share additional safeguards we are planning to adopt.

A secure, proven database technology

Most of our customers’ data is stored in MySQL databases. From the very beginning, there was a decision to keep each of our customers’ data in a separate database. MySQL technology provides a fairly easy and efficient way to keep multiple databases on the same server, which allows us to keep each customer’s data separate from the data of other customers, but still use server resources efficiently.
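To illustrate the one-database-per-customer idea, here is a minimal Python sketch. It is not the production implementation: the `customer_` naming prefix, the function names, and the customer-to-server mapping are all assumptions made for the example.

```python
def tenant_database_name(customer_id: int, prefix: str = "customer_") -> str:
    """Return the database name for one customer (hypothetical naming scheme)."""
    if customer_id <= 0:
        raise ValueError("customer_id must be a positive integer")
    return f"{prefix}{customer_id}"


def connection_settings(customer_id: int, server_for_customer: dict) -> dict:
    """Look up which server hosts the customer and which database to open there.

    `server_for_customer` is an assumed mapping of customer id -> server host.
    Several customers may share one host while each keeps its own database.
    """
    host = server_for_customer[customer_id]
    return {"host": host, "database": tenant_database_name(customer_id)}


# Two customers share a server but never share a database.
servers = {42: "db-01", 43: "db-01"}
settings_a = connection_settings(42, servers)
settings_b = connection_settings(43, servers)
```

Because every query goes through a per-customer database name, a mistake in one customer's query cannot read or change another customer's rows.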

Automated failover mechanism

Servers can stop working unexpectedly. Our MySQL databases have real-time replicas and an automated failover mechanism in case something happens to the master database. To keep disruption and data loss to an absolute minimum, the system is set up so that when an issue with the master database is detected, the machine is automatically taken offline and replaced with another machine that has been replicating its data in real time. This all happens with no manual input needed from our engineers.

Some of you may have seen “maintenance” notices at various times, and in most cases such messages appear during a failover process to prevent users from changing data on a database experiencing issues.
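The failover behaviour described in this section can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the actual mechanism: the `DatabaseNode` class, its fields, and the `fail_over` function are hypothetical, and real health detection and promotion involve much more than a boolean flag.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DatabaseNode:
    name: str
    healthy: bool = True
    role: str = "replica"  # "master", "replica" or "offline"


def fail_over(master: DatabaseNode, replicas: List[DatabaseNode]) -> DatabaseNode:
    """Return the node that should act as master after a health check.

    If the current master is healthy, nothing changes. Otherwise the master
    is taken offline and the first healthy replica is promoted in its place.
    """
    if master.healthy:
        return master
    master.role = "offline"
    for replica in replicas:
        if replica.healthy:
            replica.role = "master"
            return replica
    raise RuntimeError("no healthy replica available for promotion")


# Example: the master fails and a healthy replica takes over.
old_master = DatabaseNode("db-a", healthy=False, role="master")
standby = DatabaseNode("db-b")
new_master = fail_over(old_master, [standby])
```

While a promotion like this is in progress, writes are paused, which is when a "maintenance" notice would be shown.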

Encrypted database snapshots

In addition to replicas and failover mechanisms, we have a nightly backup process that takes snapshots of all customer databases, encrypts them, and stores them securely in a separate datacenter.

Taking snapshots is a good idea for a couple of reasons, the first being surgical data recovery. Real-time backups, by contrast, only protect against server failures, and it’s possible for the backup machine(s) to fail right after the main server fails. They also do not protect against mistakes: if a user accidentally bulk edits some data, these changes are replicated to the backup machine as well.

But the snapshots taken on previous days still have the correct data from those days before the incident, so it’s possible to recover past data up to a recent version that is less than a day old – whatever happens to the master and backup databases – and minimize the data loss.

We keep these snapshots in a separate location to minimize the risk of natural disasters or other events taking out the entire data center in one geographical location. Our hosting partners are of course prepared to handle risks like that, but it makes sense not to rely on the stability of a single location.
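The way nightly snapshots support recovery can be shown with a short Python sketch. It is only an illustration: the function names are hypothetical, the dates are made up, and the 180-day figure is taken from the restore window mentioned later in this post.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Optional

RETENTION_DAYS = 180  # restore window mentioned in this post


def snapshot_to_restore(snapshot_times: Iterable[datetime],
                        incident_time: datetime) -> Optional[datetime]:
    """Pick the most recent snapshot taken strictly before the incident."""
    earlier = [t for t in snapshot_times if t < incident_time]
    return max(earlier) if earlier else None


def expired_snapshots(snapshot_times: Iterable[datetime],
                      now: datetime,
                      retention_days: int = RETENTION_DAYS) -> List[datetime]:
    """List snapshots older than the retention window, oldest first."""
    cutoff = now - timedelta(days=retention_days)
    return sorted(t for t in snapshot_times if t < cutoff)


# Example with made-up dates: snapshots at 02:00 on three consecutive nights.
nightly = [datetime(2024, 8, day, 2, 0) for day in (1, 2, 3)]
chosen = snapshot_to_restore(nightly, datetime(2024, 8, 3, 15, 0))
```

With nightly snapshots, the chosen snapshot is never more than about a day older than the incident, which is why a replica alone is not enough: it would already contain the unwanted change.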

Elasticsearch technology

In addition to customer databases, we have several other classes of derived data. One of them is the search index, which is updated each time data is added or modified in the customer database. We power our search indexes with Elasticsearch, which is distributed, scalable and highly available.
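To make "derived data" concrete, here is a toy Python sketch of a store whose search index is updated on every write and can be rebuilt from the primary records. It deliberately does not use Elasticsearch or its API; the `ContactStore` class and its methods are hypothetical and exist only for this example.

```python
from collections import defaultdict


class ContactStore:
    """Toy primary store plus a derived word index (illustrative only)."""

    def __init__(self):
        self.records = {}              # primary data: record id -> name
        self.index = defaultdict(set)  # derived data: word -> record ids

    def save(self, record_id, name):
        """Add or modify a record and update the index in the same step."""
        old_name = self.records.get(record_id)
        if old_name is not None:
            self._unindex(record_id, old_name)
        self.records[record_id] = name
        self._index(record_id, name)

    def _index(self, record_id, name):
        for word in name.lower().split():
            self.index[word].add(record_id)

    def _unindex(self, record_id, name):
        for word in name.lower().split():
            self.index[word].discard(record_id)

    def search(self, word):
        return sorted(self.index.get(word.lower(), set()))

    def rebuild_index(self):
        """Recreate the whole index from the primary records."""
        self.index = defaultdict(set)
        for record_id, name in self.records.items():
            self._index(record_id, name)


store = ContactStore()
store.save(1, "Acme Corp")
store.save(2, "Acme Labs")
store.save(2, "Beta Labs")  # modifying a record also updates the index
```

The point of the sketch is that an index like this can always be regenerated as long as the primary records survive; data that exists only in the search store has no such fallback.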

Code-related precautions

Our engineering organization has a sophisticated way of doing code changes to prevent accidents. The process includes a group review of each task before development and a peer review before anything is released live.

We treat our code with the same care as the database backups outlined above: it is versioned and backed up many times over.

Reliability of current Pipedrive systems and personnel

To explain the reliability of the system: If a customer causes an issue with their data accidentally, we can go back to a data snapshot 1, 2, 10 or up to 180 days ago and restore the data. We do hundreds of these data recoveries for our customers every year – and haven’t lost any data in six years of operation.

I myself have worked at Skype and banks that operate internationally and understand the importance of communications, data, security, and storage, which partly explains why I decided to write about the incident publicly on our blog.

More to the point, user data has been and will continue to be handled by a core team of five dedicated professionals. Together, they have 51 years of data infrastructure experience, including extensive experience managing critical production systems in the banking, government, and telecommunications sectors, and are led by a PhD.

How the recent incident happened

Long ago, we took an architectural decision to start using Elasticsearch for storing incoming email bodies. At the time, it looked like a wise decision. Since then, we have learned that the solution doesn’t scale to the extent we expected for email storage. We have been focused on rebuilding both of our email features – Smart BCC and Full Email Sync – for quite some time, and were happy to release the latter to the public in late July.

After rebuilding and adding new features, we had to clean up our Elasticsearch storage and migrate old data into a new solution. Unfortunately, there was a mistake in our cleanup procedure that affected old, unmigrated data of our customers, as well as search indexes of those customers.

I would like to sincerely apologize for the mistake made, and personally take all responsibility. I also wish to reassure Pipedrive users that:

  • This was a one-time event, unrelated to the usual day-to-day functionality of the product,
  • We have made and will continue to make every effort to restore as much of the lost data as possible, and
  • We have learned our lesson to avoid any such incident in the future.

Improvements to reliability already in the works

As for next steps:

  • We are reviewing all of our data classifications and running through disaster recovery scenarios for each of them. The aim of these exercises is to prevent any future data loss and, should an incident be unavoidable, to significantly decrease the time it takes to recover data for each of the data classes.
  • We are preparing to migrate to multi-datacenter hosting to reduce location-based risk, with the additional benefit of running accounts from a hosting location geographically close to customers.
  • We are upgrading our storage system to Ceph, which is designed to provide excellent performance, reliability, and scalability.
  • Further, we will be growing the Infra team by one or two experienced data professionals, and are planning to add a central monitoring function, and in time, a full team, to boost our capacity to respond to technical and other potentially disruptive incidents.

Lessons learned in responding to this incident

We have learned a lot during this time, and want to share the following lessons:

  • Trust takes a long time to build and just one incident to break. We’re very aware of this, and we want to make things right.
  • We have a solid approach to reliability, but this wasn’t enough. We’ve learned an important lesson and are busy making our systems and processes even more robust.
  • Our people are the best asset we have, and we are very proud of how professional, rational, and responsive everyone at Pipedrive has been under trying circumstances.
  • We’re lucky to have customers like you. We draw an incredible amount of energy, drive, and strength from the salespeople, entrepreneurs and dealmakers who use, promote and evangelize the product and the company.

We are especially grateful to customers who got in touch to share messages like these:

“Kudos on your customer transparency. I know this sort of email is never easy to write.”
Kevin M

“That is exactly how I would handle the situation. Too many people (and companies) think it’s ok to just say “I’m sorry” repetitively and do nothing. I’m impressed that Pipedrive is owning up to the responsibility and making it right. I’ll find a way to get past this but you’ve just earned my loyalty by that action. No excuses, no excessive apologies, just decisive action. ”
Ted L

“Even with the issues this week, I would say there is one strength in your company that never seems to waver, the communication with your customers. I really appreciate that and the overall customer service you provide.”
Melissa M

“Sh*t happens when you party naked… Still love Pipedrive”
Emil M

And we’re here if you ever need to get in touch…

Our customer support is available 9am-5pm in the US and Europe, Monday to Friday – feel free to get in touch regarding this issue or anything else.

All of our customer support specialists have been briefed on the incident, should you have any unanswered questions. We will communicate any updates to admins of affected customer accounts as they are confirmed.

Finally, on a personal note, if you have any feedback on this article, my response, or that of Pipedrive as a team, please do not hesitate to email me directly on sergei@pipedrive.com, and I will do my best to answer you and satisfy any and all concerns you may have.

In the meantime, thank you for accepting our sincere apologies, for your patience and understanding, and for the time you have taken to read this post in full.

Sergei Anikin

Sergei is the VP of Engineering at Pipedrive.

  • Graham

    I appreciate the straightforward explanation and actions to assure data integrity.

    • Martin Henk

      Thanks for your support, Graham.

  • Sjaak

    Having worked for software companies myself, I know incidents like this are your worst nightmare. The only right thing to do is be open about it, learn from it and take the right steps. And that’s what you guys do. Good luck!

    • Martin Henk

      Thanks! We appreciate the kind words.

  • Alex

    Sorry, your post explains only the underlying architecture and technology.
    Nevertheless, you cannot distract from having not done proper backups. If you had done proper backups, you would be able to restore the data. Yes, backups are costly. However, as a customer we assume that we pay a rate which includes proper data privacy and data security.

    If, like us, one has used pipedrive in serious customer communications with multiple users, the financial damage is immense. That is not a single excuse. We are doing serious business and our payments were meant at a serious purpose.

    It is time to get mature and take data security seriously! An excuse is nothing.

    But this is crux with today’s cloud providers: If anything happens, they say “Hey, nevermind. It were only 9$/user.” – So, no guarantees, no lever for getting compensation.

    • Martin Henk

      Hi Alex,
      You’re absolutely right. There’s really no excuse for this situation.
      I know it’s too little too late, but we’re working hard on several fronts to remedy this data loss. There’s still hope.
      We’re definitely not taking this lightly. I can assure you everyone at Pipedrive values our customers and no one would ever say “hey, never mind the $9 per month customer”.
      We need to protect our business to be able to continue operating, so compensating for potentially lost business is of course out of the question. That said, have you contacted support to ask what we can do for you?

  • malinkoapp

    I think the main issue with the search was that there was no real workaround. In our company that meant that the sales staff could not find contact details, add notes to deals etc… for days!
    What I would have wanted was a work-around, such as a way to list (maybe by first letter) all the companies. That would have at least meant we could use the software, albeit slower.

    • Martin Henk

      Hey,
      I’m really sorry about disrupting your work.

      I know it’s way too late for these workarounds and I really hope you’ll never need them again, but it’s possible to list organizations by their first letter in the Organizations list view in Pipedrive.
      And it’s also possible to create a filter and match by organization name. I know both workarounds are cumbersome and take much longer than simply searching for something, but it’s possible.

  • Nicole Javier

    Hello there Pipedrive Team,

    We’re sorry to hear about the incident that happened during the cleanup procedure on one of your servers. But we understand the situation, as our previous emails were ‘accidentally’ included.

    Thanks for sending updates to us! We much appreciate your effort and action in working to restore the deleted email body contents and in continually fixing and improving your Pipedrive System.

    Look forward to hearing another set of favorable updates from you.

  • Jeff

    It was good to read this post, as a potential future customer. I appreciate the transparency, but I’ll be tough on you about security, that’s my background. This post is actually more about data resilience than data protection. The references to how data is actually secured are not well described, here or on the main site. In particular, “Advanced Security” is coming soon on the highest plan, when it should be the default on all plans. Having security as an optional inclusion based on plan size undermines the integrity of the security implementation across all users.

    Can you do another blog post that specifically talks about security rather than simple resilience? I’d like to know more about how you encrypt the data in the live MySQL databases; how you control access to encryption keys and manage roles (particularly across admin and developer levels); how you’ve implemented secure coding practices into your dev; how you perform security testing (bug bounties?); and how your MFA options are implemented. And most importantly, how prepared you are to respond to a targeted attack where the attacker is looking to lift your entire database and drop it on Pastebin – it’s happened to data companies much bigger than yours, and you don’t want to figure out your incident response plan at the time of the attack.

    I couldn’t find a responsible disclosure link anywhere on the site either, do you have a process? Keep up the good work, I might become a customer in future (depends on how the others respond to the same questions!)

    • Sergei Anikin

      Hi Jeff,

      you are right, this post addresses concerns regarding customer data loss we experienced. I would love to get on the security topic in the future for sure. Just some quick comments on your questions.

      “Advanced Security” coming soon on the highest plan actually means “Advanced user management” which is relevant to bigger teams and companies. We are going to change that in the plan description. We have no differences in the security level for different plans.

      In general, our security efforts are headed towards SOC2 and ISO 27001 certifications; we use the certification requirements as guidelines for our efforts.

      We don’t have a responsible disclosure link at the moment, but we run an invitation-only program on the HackerOne platform. Feel free to contact me directly for an invitation.

      Regards,
      Sergei

  • Kay Kreidler

    Hi Sergei, does Pipedrive locate the data in a US datacenter? If so, do you follow the Privacy Shield framework? THX, Kay.