Andy Crouch - Code, Technology & Obfuscation ...

Modern Data Exchange

It’s 2017 and many of the partners that I deal with are still exposing their data via FTP. This makes me sad!

That (s)FTP is still a mainstream method of exchanging data demonstrates laziness on both sides of the connection.

From the supplier side, you are saying that you do not trust your customers, or that your system is not designed with collaboration in mind. If trust is the issue and you are already generating files for a customer to collect, how is that different from providing the same data via an API endpoint? If you do not want customers accessing your application, you can segment off an API. You are already running an FTP server, so why not put that effort into something that makes the customer feel enabled?

(s)FTP is not as secure as HTTPS in practice. Plain FTP sends credentials and data unencrypted, and in the majority of cases that I have seen, (s)FTP is used to fetch files that are then processed. That means you could suffer undesirable behaviour from malicious content in those files when you process them. You have no way of verifying that what was uploaded to the (s)FTP server has not been tampered with. (You are taking precautions, right? You are not loading data straight from an FTP file into your database?!) Using a secure, encrypted API endpoint reduces those risks.
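To make the tampering point concrete, here is a minimal sketch of one way an API could let the receiver check a payload: the supplier signs each response body with a shared secret and the customer verifies the signature before processing. This is my assumption about a reasonable mechanism, not something any particular partner offers; the function names and the secret are made up for illustration.

```python
import hashlib
import hmac


def sign(payload: bytes, secret: bytes) -> str:
    """Supplier side: compute an HMAC-SHA256 signature for the payload."""
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def is_untampered(payload: bytes, signature: str, secret: bytes) -> bool:
    """Customer side: recompute the signature and compare in constant time."""
    expected = sign(payload, secret)
    return hmac.compare_digest(expected, signature)


# Hypothetical shared secret and payload, for illustration only.
SECRET = b"shared-secret-agreed-out-of-band"
body = b'{"meter_id": "M-001", "kwh": 412}'
signature = sign(body, SECRET)

print(is_untampered(body, signature, SECRET))
print(is_untampered(body.replace(b"412", b"999"), signature, SECRET))
```

The same check could be bolted onto a file drop, but with an API the signature can travel with every response as a header, so verification becomes part of the normal request flow rather than an extra step someone has to remember.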

From the customer side, why are we putting up with it? I am building modern, fast software. Why should I settle for working with companies that insist on using such old (40-year-old) technology? Why should I have to carry the burden of checking a server for updated files? Why would I choose this over a company that provides mechanisms to notify me of changes (webhooks etc.)? Why would I want to process files at all? I was exchanging XML back in 2005, and JSON simplifies things even more.

Here is a real example of a company that I deal with. They provide their data as CSV files. When we had the meeting to talk through the technical details, they asked what version of the file I wanted. “Erm, isn’t there only one CSV file format?” You’d think! It turns out that the data is provided to utility companies, and each one insists that the columns are arranged in its own particular order. This has led my partner to maintain over 90 versions of the same file, each containing the same data in a different order. Now imagine how easy it would be to do this via an API.
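As a rough sketch of what I mean, assume the supplier keeps one canonical set of records and lets each customer say which fields they want and in what order (for example through a query parameter). The record fields and the render_csv helper below are hypothetical, invented purely to illustrate the idea; they are not my partner's actual data or code.

```python
import csv
import io

# Hypothetical canonical records; field names are illustrative only.
READINGS = [
    {"meter_id": "M-001", "reading_date": "2017-03-01", "kwh": 412},
    {"meter_id": "M-002", "reading_date": "2017-03-01", "kwh": 388},
]


def render_csv(records, field_order):
    """Render the same records as CSV with columns in the caller's order."""
    known = set().union(*(record.keys() for record in records))
    unknown = [name for name in field_order if name not in known]
    if unknown:
        raise ValueError(f"unknown fields: {unknown}")
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=field_order,
        extrasaction="ignore",  # silently drop fields the caller did not ask for
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Two "versions" of the file come from one function and one data set.
print(render_csv(READINGS, ["meter_id", "reading_date", "kwh"]))
print(render_csv(READINGS, ["kwh", "meter_id"]))
```

One function and one data set replace the 90 hand-maintained layouts: the ordering becomes a request parameter the customer controls instead of a file version the supplier has to maintain.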

If you have similar frustrations, message me via Twitter or email to vent.