arkgil's comments

arkgil · on March 17, 2022

I'm sorry you find it distasteful - that was never our intention. What we found was that users searching for a PDF generator, creator, or writer are generally looking for the same solution: to create a PDF. So by repositioning our tool we were hoping to provide a better landing page experience for users who were searching for one of those specific keywords.

Of course the downside is as you've pointed out - it can be seen as distasteful and in some cases confusing to our users. We will review this on our side and see if it makes sense to remove some of those tools to reduce confusion.

arkgil · on March 17, 2022

Do you mean tables when converting HTML to PDF, or simply rendering the PDFs with tables in them?

mdellavo · on March 17, 2022

Simply rendering tables. Most of the (Python) PDF generation libraries I evaluated a few years ago had the same limitation around laying out large multipage tables (reflow is hard). We went with a headless Chrome service to print to PDF, which did not have that limitation.
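For readers unfamiliar with the headless Chrome approach mentioned above, here is a minimal sketch of what such a print-to-PDF step might look like. It is not the commenter's actual service: the binary name `chromium` and the helper names `build_print_command` and `html_to_pdf` are assumptions made for illustration, and the only browser flags used are `--headless` and `--print-to-pdf`.

```python
import subprocess
from pathlib import Path

# Assumption: the browser binary name varies by platform
# (e.g. "chromium", "google-chrome", "chrome"); adjust as needed.
CHROME_BINARY = "chromium"


def build_print_command(html_path, pdf_path, binary=CHROME_BINARY):
    """Build the argument list for a headless print-to-PDF run."""
    source_uri = Path(html_path).resolve().as_uri()  # file:// URI for the page
    target = Path(pdf_path).resolve()
    return [binary, "--headless", f"--print-to-pdf={target}", source_uri]


def html_to_pdf(html_path, pdf_path, binary=CHROME_BINARY, timeout=60):
    """Let the browser lay out and paginate the page, tables included."""
    command = build_print_command(html_path, pdf_path, binary)
    subprocess.run(command, check=True, timeout=timeout)
```

The point of the design is that pagination of long tables is delegated to the browser's layout engine rather than reimplemented in a PDF library.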

arkgil · on March 17, 2022

We've had customers in beta trying it out with multi-page tables and we've heard positive feedback.

arkgil · on March 17, 2022

Thanks for the feedback! For those that need larger volume, we also have an on-premise product: https://pspdfkit.com/api/documentation/deployment-options/.

arkgil · on March 17, 2022

Our engine is based on Google's PDFium, which is Apache licensed. We use it for rendering and reading the PDF object tree. Editing, annotations, etc. are all built on top of that.

arkgil · on March 17, 2022

With an annual plan of 10,000 documents per month, we charge $0.04 per document. If you try it out, I'd love to hear your feedback on the quality of OCR!
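To put that rate in monthly and yearly terms, here is a small worked calculation. It assumes a flat per-document price with no other fees and that exactly the quoted 10,000 documents are processed each month; neither assumption is confirmed above, and the function names are illustrative.

```python
DOCUMENTS_PER_MONTH = 10_000
PRICE_PER_DOCUMENT_CENTS = 4  # $0.04, kept in whole cents to avoid float rounding


def monthly_cost_cents(documents=DOCUMENTS_PER_MONTH, price_cents=PRICE_PER_DOCUMENT_CENTS):
    """Cost of one month at a flat per-document rate (assumed, not confirmed)."""
    return documents * price_cents


def annual_cost_cents(documents=DOCUMENTS_PER_MONTH, price_cents=PRICE_PER_DOCUMENT_CENTS):
    """Cost of twelve such months."""
    return 12 * monthly_cost_cents(documents, price_cents)


print(f"${monthly_cost_cents() / 100:,.2f} per month")  # $400.00 per month
print(f"${annual_cost_cents() / 100:,.2f} per year")    # $4,800.00 per year
```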

arkgil · on March 17, 2022

That's a great point. For folks that have strong privacy needs, we do have an on-premise product that provides the same functionality [1].

[1] https://pspdfkit.com/server/processor/

gyulai · on March 17, 2022

So what exactly does that leave? A wrapper that you've created around weasyprint, pandoc, latex, ghostscript, imagemagick, and stuff like that?

Sounds to me like an unnecessary extra expense for an unnecessary extra layer of abstraction. And there's a risk factor that comes with it: Say I make a nontrivial investment, like write a book that I'm planning on typesetting with this, or write a reporting infrastructure that creates automated reports or something. I'll make a huge up-front investment there that is tied to your API. Then I want to run this, while not touching it, for 10 years so it can earn a return on investment.

Then I come back to it 10 years later, because I'm writing the second edition of the book, or I want to change something about my reporting infrastructure. Has your company gone out of business in the meantime? Have you deprecated the product? Do you still support the API from 10 years ago? Does it still produce the same output for the same input? ...or do I need to take a huge write-off on all the work I've done on typesetting my book or hooking up my reporting infrastructure?

In the open source world, I'd just make sure to bundle all the tools I'm using, including their source code, in a Docker container or something. In the "10 years later" scenario, I'll probably need to touch only the book's source code, or the reporting infrastructure's source code, not the typesetting infrastructure. And if there's something I really really need, then I can go to the source and change it.

cloud8421 · on March 17, 2022

You’re touching on a few different points so I’ll try to cover everything.

- We do build on top of OSS (just not those programs you listed - see https://pspdfkit.com/legal/acknowledgements/processor-acknow... for a complete list). The layer we build is quite large though, and it would take many person-years to replicate in its entirety. It’s possible though that you don’t need that at all and a focused program that wraps other ones might do the trick for your use case.

- If you build a product based on our tech, you’re taking a conscious decision about risk. I do think we’re gonna be in business in 10 years (we have solid revenue and last year we got backed by a large investor, Insight), and that we would version APIs and support you (not just during upgrades). The reality, though, is that it is indeed possible that we’re not gonna be around anymore, like every other company on the planet. As a consumer, this is the reality for most of the things we buy nowadays. We do take deprecation seriously, as we sell SDKs, and I’m sure that if the company shut down you would have enough time to migrate.

- Depending on what you need to build, using our product may shortcut your development time by a large factor. It may not, if you just need to rotate pages of a PDF document and there’s a reliable OSS package that does that in your language of choice. It really depends on what you need to do.

- Even if you package everything with OSS, waiting 10 years is a sufficiently large amount of time that it may not work and you have to fork and rebuild yourself. It’s a different type of risk, but still a risk. 10 years ago Docker had just been launched. Whether you build something on OSS or commercial, you would wanna test things once a year to see if they still work or keep up with security and bug fixes.

Ultimately, there are situations where the approach you described is sound: for example, I do my taxes in plain text accounting, using ledger and emacs. I generate the reporting via a couple of Ruby scripts. I do that exactly because I care about longevity: I do my taxes once a year, I don’t wanna spend time fixing the toolchain every time I have to do them. Even so, every year I hit a couple of snags I have to fix, which I consider acceptable.
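The yearly "does it still work" check discussed in this exchange, along with the earlier question of whether the same input still produces the same output, could be sketched as a checksum comparison against a recorded digest. This is an illustration under assumptions, not something either commenter describes doing: the helper names and the digest-file layout are hypothetical, and a byte-level comparison may report false differences if the generator embeds timestamps or random identifiers in its output.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def output_unchanged(output_path, golden_digest_path):
    """Compare an output file against a stored digest.

    On the first run (no digest file yet) the current digest is recorded
    and the check passes; later runs return False if the bytes differ.
    """
    golden = Path(golden_digest_path)
    current = sha256_of(output_path)
    if not golden.exists():
        golden.write_text(current + "\n")
        return True
    return golden.read_text().strip() == current
```

Run once after the initial build to record the digest, then on each later rebuild; a `False` result signals that the toolchain or service no longer reproduces the earlier output byte for byte.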

cinntaile · on March 17, 2022

It's unclear what you're trying to say. They've been around since 2010 and they have quite a large team, why would they suddenly disappear? Also what do you want them to do?

gyulai · on March 17, 2022

What I wanted to say was: "PDF processing is something where I wouldn't want to rely on an online API over something local. And it's also something where I wouldn't want to rely on a small commercial company over an open source project".

I once worked for a software company where non-tech clients would have custom-made software developed for their exclusive use. Half the projects we did were "We're relying for one of our business functions on this software that we bought from this company that's now out of business. We need you to reimplement it from scratch, because we need a tiny change."

cinntaile · on March 17, 2022

To me it makes sense to buy this functionality instead of building it yourself. The upfront cost involved with building it yourself will likely be much higher, even if you manage to chain together a bunch of open source tools.

matchagaucho · on March 17, 2022

"Then I come back to it 10 years later, because I'm writing the second edition of the book, or I want to change something about my reporting infrastructure."

Those seem like one-off PDF conversion use cases that MS Word or Acrobat can easily handle. Not a high-volume, daily PDF invoice use case.

arkgil · on March 17, 2022

Signing is something we'd like to explore; we often hear from folks who want to simplify their signing workflows. Thanks for the feedback!

arkgil · on April 10, 2017

Thank you for the feedback! Looking at all the comments here, we will definitely look into improving documentation.

As for network traffic, we use the net_kernel module, which allows us to track throughput between nodes in the cluster. It doesn't rely on the specifics of your system.