Java – grab tagged Instagram photos in real time
I'm trying to download photos with specific tags. I found the real-time API rather useless, so I used a long-polling strategy. The following is pseudocode, with comments on the subtle errors:
    newMediaCount = getMediaCount();
    delta = newMediaCount - mediaCount;
    if (delta > 0) {
        // if mediaCount changed by now, realDelta > delta, so realDelta - delta photos won't be grabbed,
        // and on the next poll, if mediaCount didn't change again, realDelta - delta would be duplicated, else ...
        // if a photo is posted from a private account, the last photo will be duplicated,
        // as the counter changes but nothing is added to recent
        recentMedia = getRecentMedia(delta);
        // persist recentMedia
        mediaCount = newMediaCount;
    }
The second problem can be solved with some kind of de-duplication, but the first one really bothers me. I've moved the two calls to the Instagram API as close together as possible, but is that enough?
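The duplicate problem described above could be handled by remembering the IDs that were already persisted. This is only a minimal sketch, assuming every media item has a stable string ID (as `d.getId()` in the later code suggests); the class `SeenMediaFilter` and its method are hypothetical names, not part of any Instagram library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper: remembers media IDs already handled and drops repeats. */
class SeenMediaFilter {
    private final Set<String> seenIds = new HashSet<>();

    /** Returns only the IDs not seen in any earlier call, in their original order. */
    List<String> keepNew(List<String> polledIds) {
        List<String> fresh = new ArrayList<>();
        for (String id : polledIds) {
            // Set.add returns false when the ID was already present
            if (seenIds.add(id)) {
                fresh.add(id);
            }
        }
        return fresh;
    }

    public static void main(String[] args) {
        SeenMediaFilter filter = new SeenMediaFilter();
        System.out.println(filter.keepNew(Arrays.asList("a", "b"))); // prints [a, b]
        System.out.println(filter.keepNew(Arrays.asList("b", "c"))); // prints [c]
    }
}
```

The set grows without bound in this form, so a long-running poller would probably want to evict old IDs. It also only addresses duplicates, not the skipped photos.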
edit
As Amir suggested, I rewrote the code to use min_tag_id / max_tag_id, but it still skips photos. I couldn't find a better way to test this than saving the images to disk for a while and comparing the result with instagram.com/explore/tags/.
    public class LousyInstagramApiTest {

        @Test
        public void testFeedContinuity() throws Exception {
            Instagram instagram = new Instagram(Settings.getClientId());
            final String TAG_NAME = "portrait";
            String id = instagram.getRecentMediaTags(TAG_NAME).getPagination().getMinTagId();
            HashtagEndpoint endpoint = new HashtagEndpoint(instagram, TAG_NAME, id);
            for (int i = 0; i < 10; i++) {
                Thread.sleep(3000);
                endpoint.recentFeed().forEach(d -> {
                    try {
                        URL url = new URL(d.getImages().getLowResolution().getImageUrl());
                        BufferedImage img = ImageIO.read(url);
                        ImageIO.write(img, "png", new File("D:\\tmp\\" + d.getId() + ".png"));
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                });
            }
        }
    }

    class HashtagEndpoint {
        private final Instagram instagram;
        private final String hashtag;
        private String minTagId;

        public HashtagEndpoint(Instagram instagram, String hashtag, String minTagId) {
            this.instagram = instagram;
            this.hashtag = hashtag;
            this.minTagId = minTagId;
        }

        public List<MediaFeedData> recentFeed() throws InstagramException {
            TagMediaFeed feed = instagram.getRecentMediaTags(hashtag, minTagId, null);
            List<MediaFeedData> dataList = feed.getData();
            if (dataList.size() == 0) return Collections.emptyList();
            String maxTagId = feed.getPagination().getNextMaxTagId();
            if (maxTagId != null && maxTagId.compareTo(minTagId) > 0)
                dataList.addAll(paginateFeed(maxTagId));
            Collections.reverse(dataList);
            // dataList.removeIf(d -> d.getId().compareTo(minTagId) < 0);
            minTagId = feed.getPagination().getMinTagId();
            return dataList;
        }

        private Collection<? extends MediaFeedData> paginateFeed(String maxTagId) throws InstagramException {
            System.out.println("pagination required");
            List<MediaFeedData> dataList = new ArrayList<>();
            do {
                TagMediaFeed feed = instagram.getRecentMediaTags(hashtag, null, maxTagId);
                maxTagId = feed.getPagination().getNextMaxTagId();
                dataList.addAll(feed.getData());
            } while (maxTagId.compareTo(minTagId) > 0);
            return dataList;
        }
    }
Solution
Use the tag endpoints to get the recent media for the tag you want. The response returns a min_tag_id in its pagination information, which is tied to the most recently tagged media at the time of your call. Because the API also accepts a min_tag_id parameter, you can pass the number from your last query to receive only the media tagged since that query.
So with whatever polling mechanism you have, you just call the API to get the new recent media, if any, based on the last received min_tag_id.
You also need to pass a large count parameter and follow the pagination of the response, so that you receive all the data without losing anything when tagging is faster than your polling.
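The polling idea above can be sketched without touching the network. The following is only an illustration under stated assumptions: `FakeTagFeed` is a hypothetical in-memory stand-in, not the real Instagram API or jInstagram, tag IDs are assumed to be increasing numbers, and `page(...)` is an invented method that mimics a paged "recent media" call bounded by a minimum and maximum tag ID.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical in-memory stand-in for a tag feed; NOT the real Instagram API. */
class FakeTagFeed {
    private final List<Long> ids = new ArrayList<>(); // posted in ascending ID order

    void post(long id) {
        ids.add(id);
    }

    /** Returns up to pageSize IDs, newest first, with minTagId < id < maxTagId. */
    List<Long> page(long minTagId, long maxTagId, int pageSize) {
        List<Long> page = new ArrayList<>();
        for (int i = ids.size() - 1; i >= 0 && page.size() < pageSize; i--) {
            long id = ids.get(i);
            if (id > minTagId && id < maxTagId) {
                page.add(id);
            }
        }
        return page;
    }
}

/** Polls the feed and keeps the newest ID seen as the next minimum tag ID. */
class CursorPoller {
    private final FakeTagFeed feed;
    private final int pageSize;
    private long minTagId;

    CursorPoller(FakeTagFeed feed, long startAfter, int pageSize) {
        this.feed = feed;
        this.minTagId = startAfter;
        this.pageSize = pageSize;
    }

    /** Returns everything posted since the previous poll, oldest first. */
    List<Long> poll() {
        List<Long> collected = new ArrayList<>();
        long maxTagId = Long.MAX_VALUE;
        while (true) {
            List<Long> page = feed.page(minTagId, maxTagId, pageSize);
            if (page.isEmpty()) {
                break; // nothing left between the cursor and the last page
            }
            collected.addAll(page);
            maxTagId = page.get(page.size() - 1); // continue below the oldest item of this page
        }
        if (!collected.isEmpty()) {
            minTagId = collected.get(0); // newest item becomes the cursor for the next poll
        }
        Collections.reverse(collected);
        return collected;
    }

    public static void main(String[] args) {
        FakeTagFeed feed = new FakeTagFeed();
        CursorPoller poller = new CursorPoller(feed, 0L, 2);
        for (long id = 1; id <= 5; id++) {
            feed.post(id);
        }
        System.out.println(poller.poll()); // prints [1, 2, 3, 4, 5]
        feed.post(6L);
        System.out.println(poller.poll()); // prints [6]
        System.out.println(poller.poll()); // prints []
    }
}
```

In this sketch the cursor is taken from the newest item of the first page, so anything posted while older pages are still being fetched has a higher ID and is picked up by the next poll rather than skipped. Whether the real API's min_tag_id behaves exactly like this is an assumption to verify against the service.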
Update, based on your updated code:
    public List<MediaFeedData> recentFeed() throws InstagramException {
        TagMediaFeed feed = instagram.getRecentMediaTags(hashtag, 100000);
        List<MediaFeedData> dataList = feed.getData();
        if (dataList.size() == 0) return Collections.emptyList();

        // follow the pagination
        MediaFeed recentMediaNextPage = instagram.getRecentMediaNextPage(feed.getPagination());
        while (recentMediaNextPage.getPagination() != null) {
            dataList.addAll(recentMediaNextPage.getData());
            recentMediaNextPage = instagram.getRecentMediaNextPage(recentMediaNextPage.getPagination());
        }

        Collections.reverse(dataList);
        minTagId = feed.getPagination().getMinTagId();
        return dataList;
    }