Reduce the PDF file size of a plot by filtering hidden objects

When generating a scatter plot of many points in R (for example, using ggplot()), many points may lie behind other points and not be visible at all. For example, see the following figure:

This is a scatter plot of hundreds of thousands of points, but most of them are hidden behind other points. The problem is that when the output is saved as a vector file (such as a PDF), the invisible points still increase the file size, and they increase memory and CPU usage when the file is viewed.

A simple solution is to save the output as a bitmap image (such as TIFF or PNG), but bitmaps lose vector quality and may be even larger. I tried some online PDF compressors, but the result was the same size as the original file.

Is there a good solution? For example, is there a way to filter out the invisible points, either while generating the plot or by editing the PDF file afterwards?

Solution

As a start, you can do something like this:

set.seed(42)
# One million points scattered around the line y = x
x <- runif(1e6)
DF <- data.frame(x = x, y = x + rnorm(1e6, sd = 0.1))
plot(y ~ x, data = DF, pch = ".", cex = 4)

PDF size: 6334 KB

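# Round the coordinates to 3 decimals and keep only the first point at each rounded position
DF2 <- data.frame(x = round(DF$x, 3), y = round(DF$y, 3))
DF2 <- DF[!duplicated(DF2), ]
nrow(DF2)
#[1] 373429
# Use the same plotting symbol as before so the file sizes are comparable
plot(y ~ x, data = DF2, pch = ".", cex = 4)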

PDF size: 2373 KB
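
To check the effect on your own data, you can write each version to a PDF and compare the file sizes. The following is only a sketch: it assumes the base R pdf() device with its default settings, writes to temporary files, uses a smaller sample and coarser rounding to keep it quick, and the helper name pdf_size_kb is made up for illustration.

```r
# Hypothetical helper: draw a data frame to a temporary PDF and return its size in KB.
pdf_size_kb <- function(data) {
  path <- tempfile(fileext = ".pdf")
  pdf(path)
  plot(y ~ x, data = data, pch = ".", cex = 4)
  dev.off()
  size <- file.size(path) / 1024
  unlink(path)
  size
}

set.seed(42)
x <- runif(1e5)
DF <- data.frame(x = x, y = x + rnorm(1e5, sd = 0.1))

# Keep one point per rounded (x, y) position.
keep <- !duplicated(data.frame(x = round(DF$x, 2), y = round(DF$y, 2)))
DF2 <- DF[keep, ]

sizes <- c(full = pdf_size_kb(DF), thinned = pdf_size_kb(DF2))
print(round(sizes))
```

The exact numbers depend on the device settings and the R version, so treat them as relative rather than absolute.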

The number of digits you round to controls how many points are removed. You only need to modify this approach if the points have different colors.
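
One possible way to handle colors is to include the color in the duplicate check, so that a point is dropped only when another point of the same color already occupies the same rounded position. This is a sketch, not part of the original answer: the col column and its two values are assumptions for illustration, and the code ignores drawing order, so a point completely covered by a point of a different color is still kept.

```r
set.seed(42)
n <- 1e5
x <- runif(n)
DF <- data.frame(
  x = x,
  y = x + rnorm(n, sd = 0.1),
  col = sample(c("red", "blue"), n, replace = TRUE),  # hypothetical color column
  stringsAsFactors = FALSE
)

# Round the coordinates, but keep the color as part of the key.
key <- data.frame(x = round(DF$x, 2), y = round(DF$y, 2), col = DF$col)
DF2 <- DF[!duplicated(key), ]

plot(y ~ x, data = DF2, col = DF2$col, pch = ".", cex = 4)
```

Because the color is part of the key, at most one point per color survives at each rounded position, so the thinned plot keeps every color that was present in a given spot.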
