caption_hashtags property for only-if evaluation
caption_hashtags is a list of all hashtags mentioned in the Post's caption. It makes it easy to filter Posts that carry multiple hashtags, and thus fixes #24. Further, the documentation of --only-if has been completed by linking to a description of the syntax in the Python documentation and to a list of all defined properties with their meanings. This commit therefore also closes #42.
parent d84136b2dd
commit 5b5d540310

README.rst (13 changes)
@@ -129,8 +129,8 @@ Filter Posts
 The ``--only-if`` option allows to specify criterias that posts have to
 meet to be downloaded. If not given, all posts are downloaded. It must
 be a boolean Python expression where the variables ``likes``,
-``comments``, ``viewer_has_liked``, ``is_video``, ``date``, and some
-more (see class ``instaloader.Post`` for a full list) are defined.
+``comments``, ``viewer_has_liked``, ``is_video``, and many
+more are defined.
 
 A few examples:
 
@@ -153,8 +153,17 @@ Or you may **skip videos**:
 
     instaloader --only-if="not is_video" target
 
+Or you may filter by hashtags that occur in the Post's caption. For
+example, to download posts of kittens that are cute: ::
+
+    instaloader --only-if="'cute' in caption_hashtags" "#kitten"
+
 .. basic-usage-end
 
+(For a more complete description of the ``-only-if`` option, refer to
+the `Instaloader Documentation <https://instaloader.readthedocs.io/basic-usage.html#filter-posts>`__)
+
+
 Advanced Options
 ----------------
 

@@ -16,6 +16,8 @@ Introduction
    :members:
    :undoc-members:
 
+.. _post-class:
+
 ``Post`` Class
 ^^^^^^^^^^^^^^
 

@@ -7,3 +7,9 @@ Basic Usage
 .. include:: ../README.rst
    :start-after: basic-usage-start
    :end-before: basic-usage-end
+
+.. (continuation of --only-if explanation)
+
+The given string is evaluated as a
+`Python boolean expression <https://docs.python.org/3/reference/expressions.html#boolean-operations>`__,
+where all occuring variables are properties of the :ref:`post-class`.
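
The evaluation model described in the hunk above can be sketched in a few lines of Python. This is a minimal illustration and not Instaloader's actual implementation: `matches_filter` and the stand-in `post` object are hypothetical names, and the sketch simply resolves every variable that occurs in the expression as an attribute of the given object before evaluating it.

```python
import ast
from types import SimpleNamespace


def matches_filter(expression, post):
    """Evaluate a boolean filter string against the attributes of `post`.

    Hypothetical helper: each name in the expression is looked up as an
    attribute of `post`; an unknown name raises AttributeError.
    """
    tree = ast.parse(expression, mode="eval")
    # Collect every variable name that occurs in the expression.
    names = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    variables = {name: getattr(post, name) for name in names}
    # Builtins are emptied so only the collected attributes are visible.
    code = compile(tree, "<only-if>", "eval")
    return bool(eval(code, {"__builtins__": {}}, variables))


# Stand-in for a Post; the attribute names mirror those used in the README.
post = SimpleNamespace(likes=120, is_video=False,
                       caption_hashtags=["cute", "kitten"])

print(matches_filter("'cute' in caption_hashtags and not is_video", post))  # True
print(matches_filter("likes > 200", post))                                  # False
```

Note that `eval` executes whatever it is given, so a scheme like this is only reasonable for expressions the user typed on their own command line.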

@@ -275,6 +275,16 @@ class Post:
         elif "caption" in self._node:
             return self._node["caption"]
 
+    @property
+    def caption_hashtags(self) -> List[str]:
+        """List of all hashtags (without preceeding #) which occur in the Post's caption."""
+        if not self.caption:
+            return []
+        # This regular expression is from jStassen, adjusted to use Python's \w to support Unicode
+        # http://blog.jstassen.com/2016/03/code-regex-for-instagram-username-and-hashtags/
+        hashtag_regex = re.compile(r"(?:#)(\w(?:(?:\w|(?:\.(?!\.))){0,28}(?:\w))?)")
+        return re.findall(hashtag_regex, self.caption)
+
     @property
     def is_video(self) -> bool:
         """True if the Post is a video."""
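
To see what the new property returns, the regular expression from the hunk above can be tried on a plain string. The sketch below copies the pattern verbatim; the function name `extract_hashtags` and the sample captions are assumptions made for illustration and are not part of the commit.

```python
import re
from typing import List, Optional

# Pattern copied from the diff above: a "#" followed by a captured tag of
# word characters, optionally with single dots between them.
HASHTAG_REGEX = re.compile(r"(?:#)(\w(?:(?:\w|(?:\.(?!\.))){0,28}(?:\w))?)")


def extract_hashtags(caption: Optional[str]) -> List[str]:
    """Return all hashtags in `caption`, without the leading '#'."""
    if not caption:
        # Mirrors the property: a missing or empty caption has no hashtags.
        return []
    # With a single capturing group, findall returns the captured tags only.
    return HASHTAG_REGEX.findall(caption)


print(extract_hashtags("Sleepy #kitten, so #cute!"))  # ['kitten', 'cute']
print(extract_hashtags(None))                         # []
```

Because the result is an ordinary list, a filter such as `'cute' in caption_hashtags` is a plain membership test on the extracted tags.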