caption_hashtags property for only-if evaluation
caption_hashtags is a list of all hashtags mentioned in the Post's caption. It makes it easy to filter Posts that have multiple hashtags, and as such fixes #24. Further, the documentation of --only-if has been completed by linking to a description of the syntax in the Python documentation and to a list of all defined properties with their meanings. So, this commit also closes #42.
parent d84136b2dd
commit 5b5d540310
13
README.rst
@@ -129,8 +129,8 @@ Filter Posts
The ``--only-if`` option allows specifying criteria that posts have to
meet to be downloaded. If not given, all posts are downloaded. It must
be a boolean Python expression where the variables ``likes``,
``comments``, ``viewer_has_liked``, ``is_video``, ``date``, and some
more (see class ``instaloader.Post`` for a full list) are defined.
``comments``, ``viewer_has_liked``, ``is_video``, and many
more are defined.

A few examples:

@@ -153,8 +153,17 @@ Or you may **skip videos**:

    instaloader --only-if="not is_video" target

Or you may filter by hashtags that occur in the Post's caption. For
example, to download posts of kittens that are cute: ::

    instaloader --only-if="'cute' in caption_hashtags" "#kitten"

.. basic-usage-end

(For a more complete description of the ``--only-if`` option, refer to
the `Instaloader Documentation <https://instaloader.readthedocs.io/basic-usage.html#filter-posts>`__)


Advanced Options
----------------

@@ -16,6 +16,8 @@ Introduction
   :members:
   :undoc-members:

.. _post-class:

``Post`` Class
^^^^^^^^^^^^^^

@@ -7,3 +7,9 @@ Basic Usage
.. include:: ../README.rst
   :start-after: basic-usage-start
   :end-before: basic-usage-end

.. (continuation of --only-if explanation)

The given string is evaluated as a
`Python boolean expression <https://docs.python.org/3/reference/expressions.html#boolean-operations>`__,
where all occurring variables are properties of the :ref:`post-class`.
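To make the evaluation described in the documentation hunk above concrete, here is a minimal sketch of how a filter string could be evaluated against an object's properties. It is not Instaloader's actual implementation: `FakePost` and `matches_filter` are hypothetical names, and building the variable namespace from public attributes and calling `eval` is an assumption made purely for illustration.

```python
from typing import Any, Dict, List


class FakePost:
    """Hypothetical stand-in for instaloader.Post, exposing a few properties."""

    def __init__(self, likes: int, is_video: bool, caption_hashtags: List[str]) -> None:
        self.likes = likes
        self.is_video = is_video
        self.caption_hashtags = caption_hashtags


def matches_filter(filter_str: str, post: Any) -> bool:
    """Evaluate filter_str as a Python expression whose variables are post attributes."""
    # Assumption: every public attribute of the post becomes a variable name.
    namespace: Dict[str, Any] = {
        name: getattr(post, name) for name in dir(post) if not name.startswith("_")
    }
    # An empty __builtins__ limits what the expression can reach,
    # but this is not a real sandbox.
    return bool(eval(filter_str, {"__builtins__": {}}, namespace))


post = FakePost(likes=120, is_video=False, caption_hashtags=["kitten", "cute"])
print(matches_filter("'cute' in caption_hashtags", post))
print(matches_filter("not is_video and likes > 100", post))
print(matches_filter("is_video", post))
```

With this sample post, the first two expressions hold and the third does not, which mirrors how a filter such as `--only-if="'cute' in caption_hashtags"` would select or skip a post.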
@@ -275,6 +275,16 @@ class Post:
        elif "caption" in self._node:
            return self._node["caption"]

    @property
    def caption_hashtags(self) -> List[str]:
        """List of all hashtags (without preceding #) which occur in the Post's caption."""
        if not self.caption:
            return []
        # This regular expression is from jStassen, adjusted to use Python's \w to support Unicode
        # http://blog.jstassen.com/2016/03/code-regex-for-instagram-username-and-hashtags/
        hashtag_regex = re.compile(r"(?:#)(\w(?:(?:\w|(?:\.(?!\.))){0,28}(?:\w))?)")
        return re.findall(hashtag_regex, self.caption)

    @property
    def is_video(self) -> bool:
        """True if the Post is a video."""
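The behaviour of the regular expression in the hunk above can be checked in isolation. The sketch below copies the pattern verbatim from the diff; the standalone function `extract_hashtags` and the sample captions are hypothetical and only serve to show what `re.findall` returns for a pattern with a single capturing group.

```python
import re
from typing import List, Optional

# Pattern copied from the diff: a '#', then a captured tag of word characters,
# optionally with single dots inside, capped at 30 characters.
HASHTAG_REGEX = re.compile(r"(?:#)(\w(?:(?:\w|(?:\.(?!\.))){0,28}(?:\w))?)")


def extract_hashtags(caption: Optional[str]) -> List[str]:
    """Return all hashtags (without the leading '#') found in caption."""
    if not caption:
        # Mirrors the early return for posts without a caption.
        return []
    # With exactly one capturing group, findall returns the group contents.
    return HASHTAG_REGEX.findall(caption)


print(extract_hashtags("Look at my #cute #kitten! #süß"))  # ['cute', 'kitten', 'süß']
print(extract_hashtags(None))  # []
```

Because Python's `\w` matches Unicode word characters on `str` patterns, non-ASCII tags such as `#süß` are captured as well, and trailing punctuation like `!` is left out of the tag.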