wordpress-activitypub/includes/class-hashtag.php
Matthias Pfefferle b744dc551d
Comment Federation (#550)
* Comments 1

* Delete FUNDING.yml

* Add basic BuddyPress support

fix #122

thanks and props @skysarwer

* change URL to `bp_core_get_user_domain`

* fix "Follow" issue

fix #133

* fix #135

* version bump

* Create phpunit.yml

* Update composer.json

* Update composer.json

* Update phpunit.yml

* Update composer.json

* Create phpcs.yml

* Update phpcs.xml

* Update composer.json

* phpcs fixes

* fix typo

* Comments update

* webfinger_extract remove extra param

* coding standards

* Replies Collection, settings, other fixes

* Create stale.yml

* move stale file

* code standards cleanup

* Migrate / Update script

* bugfix

* add settings link to plugin page

* fix code standards

* fix cs

* fix PHPCS

* PHPCS fixes

* change background image for wp.org

* fix docker

* fix webfinger for email identifiers

fix #152

* version bump

* update composer file to fix unit testing

* allow plugins

* fix dependencies

* Migrate tools

* code cleanup

* regression fix

* Fix announce, clarified language

* update included filename

* code cleanup

* Improve migration UX

* Add comments view, warnings to migrate page

* style fix

* more style fixes

* Fix send_delete_activity

* replace ap_comment_id to reuse replytocom var

* Comments class missing attributes

* Post class fix attributes

* move js file to assets/js

* Separate file for Comment processing hooks

* fix file path

* associate comments to back compat post

* Fix js assets enqueue

* change regex matching potential hashtags

Matches any string starting with '#' and consisting of any number and combination of [A-Za-z0-9_] that is directly followed by whitespace or punctuation. Groups everything after '#' for access in functions using this regex.

This fixes #183 (incomplete links on hashtags containing special characters) by not matching these at all.

* also detect hashtags at the start of a paragraph

* restrict html tags after which to detect a hashtag

Hashtags should not be detected after just any html tag - for example not after an opening a or div. To still allow detection at the start of a line, allow specifically p and br to directly precede a hashtag.

* fix pagination

* Add Custom Post Type support to outbox API

* remove comment_type

* fix comparison

* remove trailing spaces

* fix phpcs issues

* fix phpcs issues

* run phpcs also on pull_requests

* fix phpcs issues

* support threaded comments from ActivityPub

* refactor support for threaded comments from ActivityPub

* remove debugging log line

* add first unit tests for class inbox

* fix code smells

* make filter function static

* attempt to resolve backwards compatibility issues

* update js to new file

* delete old js

* Remove migrate code

* update post meta canonical

* remove type and mention meta from comment filters

* extract mentions from comment_content

* phpcbf

* remove extra curly bracket

* Remove migrate code

* remove version_check()

* Update enqueue scripts

* Remove remote comments from preprocessing

* Reply to comments from Dashboard

* rename function, inserts users into reply text

* Update dispatch comments

* update comment model

* fix comment model replies property

* fix preprocess_comment cap check

* Add webfinger filter to comments

* Add comment edit datetime

* cleanup

* fix var name

* cleanup

* phpcbf

* better actual translation support

* Separate comment reply script

* migrate dispatch, migrate comment model to transform

* ignore WP_Comment type for now

* Adds new helpers for resolving inReplyTo url

* Update activitypub_send_comment_activity to include type

* remove redundant id check

* reinclude user_id in saved ap_object meta

* update post field meta

* Fix comment updated datetime

* front-end reply inserts @mentions

* enqueue reply script on front end

* use const instead of dirname

* some simplifications

* move some functions

* fixes

* some more fixes

* fix namespace

* fix unittests

* fix testcase

* fixed typo

* fix tests

* fix tests

* fix PHPCS

* move functions to transformer class

* fix warnings

* Link remote comments on frontend

* Link to comment source as row action

* Init Comments class

* remove dead dispatch action

* re-add extract mentions filter

* Restore and tweak Comment transform

* Schedule comments activities for non-admin users

* lint

* remove context property

* rename get_id method to generate_id

* fix locale

* move functions

* PHPDoc

* this is never used

* remove some edit methods

* remove replies for now

* remove JS calls

* remove reply_recipients

* never used

* remove other query-vars

* otherwise to_json would not work properly

* small changes

* use `c` for comment IDs

* remove comments.php for now

maybe re-add it later

* wp_insert_post is an action

* also parse comment_text

* remove duplicate functions

* add Base transformer

* remove invalid test

* update to new query var

* update dispatcher to support comments and posts

* fix transition

* remove unused functions for now

* schedule_comment_activity seems to ignore create and update

* fix wrong use of functions!

* not every platform sends a URL

* check source-id first

* remove hashtags for now

* fallback to ID

* fix typo

* move to_activity to Base class

* remove unused function

* add support for announce and like

* also ping inboxes of other commenters in the thread

* restructure WebFinger class

* some small improvements

* simplified to_object class

props @Menrath for the feedback and the idea!

* fix unit tests

* make transformer filterable

/cc @Menrath

* use transformer factory, so that transformer can be overwritten

* phpcs fixes

* fix attachments

* fix comment transformer

* remove comments for now

* update readme/changelog

* simplify and unify json_encodes

---------

Co-authored-by: Django Doucet <mediaformat.ux@gmail.com>
Co-authored-by: Andreas <andreas@bocops.de>
Co-authored-by: Eana Hufwe <eana@1a23.com>
Co-authored-by: Matthew Exon <git.mexon@spamgourmet.com>
Co-authored-by: Django Doucet <django.doucet@webdevstudios.com>
2023-12-22 10:12:26 +01:00

119 lines
3.2 KiB
PHP

<?php
namespace Activitypub;

/**
 * ActivityPub Hashtag Class
 *
 * @author Matthias Pfefferle
 */
class Hashtag {
	/**
	 * Initialize the class, registering WordPress hooks
	 */
	public static function init() {
		if ( '1' === \get_option( 'activitypub_use_hashtags', '1' ) ) {
			\add_action( 'wp_insert_post', array( self::class, 'insert_post' ), 10, 2 );
			\add_filter( 'the_content', array( self::class, 'the_content' ), 10, 1 );
		}
	}

	/**
	 * Action callback to save #tags as real WordPress tags
	 *
	 * @param int      $id   the rev-id
	 * @param \WP_Post $post the post
	 *
	 * @return int the unchanged ID (ignored by WordPress, as `wp_insert_post` is an action)
	 */
	public static function insert_post( $id, $post ) {
		if ( \preg_match_all( '/' . ACTIVITYPUB_HASHTAGS_REGEXP . '/i', $post->post_content, $match ) ) {
			$tags = \implode( ', ', $match[1] );

			// Tags go to `$post->post_parent`, which assumes `$post` is a revision (see "rev-id" above).
			\wp_add_post_tags( $post->post_parent, $tags );
		}

		return $id;
	}

	/**
	 * Filter to replace the #tags in the content with links
	 *
	 * @param string $the_content the post-content
	 *
	 * @return string the filtered post-content
	 */
	public static function the_content( $the_content ) {
		// small protection against execution timeouts: limit to roughly 1 MB
		// (`mb_strlen` counts characters, not bytes)
		if ( mb_strlen( $the_content ) > MB_IN_BYTES ) {
			return $the_content;
		}

		$tag_stack          = array();
		$protected_tags     = array(
			'pre',
			'code',
			'textarea',
			'style',
			'a',
		);
		$content_with_links = '';
		$in_protected_tag   = false;

		foreach ( wp_html_split( $the_content ) as $chunk ) {
			// Never inspect HTML comments.
			if ( preg_match( '#^<!--[\s\S]*-->$#i', $chunk, $m ) ) {
				$content_with_links .= $chunk;
				continue;
			}

			if ( preg_match( '#^<(/)?([a-z-]+)\b[^>]*>$#i', $chunk, $m ) ) {
				$tag = strtolower( $m[2] );
				if ( '/' === $m[1] ) {
					// Closing tag.
					$i = array_search( $tag, $tag_stack, true );
					// We can only remove the tag from the stack if it is in the stack.
					if ( false !== $i ) {
						$tag_stack = array_slice( $tag_stack, 0, $i );
					}
				} else {
					// Opening tag, add it to the stack.
					$tag_stack[] = $tag;
				}

				// If we're in a protected tag, the tag_stack contains at least one protected tag string.
				// The protected tag state can only change when we encounter a start or end tag.
				$in_protected_tag = ! empty( array_intersect( $tag_stack, $protected_tags ) );

				// Never inspect tags.
				$content_with_links .= $chunk;
				continue;
			}

			if ( $in_protected_tag ) {
				// Don't inspect a chunk inside a protected tag.
				$content_with_links .= $chunk;
				continue;
			}

			// Only reachable when there is no protected tag in the stack.
			$content_with_links .= \preg_replace_callback( '/' . ACTIVITYPUB_HASHTAGS_REGEXP . '/i', array( self::class, 'replace_with_links' ), $chunk );
		}

		return $content_with_links;
	}

	/**
	 * A callback for preg_replace_callback to build the term links
	 *
	 * @param array $result the preg_match results
	 *
	 * @return string the final string
	 */
	public static function replace_with_links( $result ) {
		$tag        = $result[1];
		$tag_object = \get_term_by( 'name', $tag, 'post_tag' );

		if ( $tag_object ) {
			$link = \get_term_link( $tag_object, 'post_tag' );

			// `get_term_link` can return a `WP_Error`; in that case keep the plain hashtag.
			if ( ! \is_wp_error( $link ) ) {
				return \sprintf( '<a rel="tag" class="hashtag u-tag u-category" href="%s">#%s</a>', \esc_url( $link ), $tag );
			}
		}

		return '#' . $tag;
	}
}
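
The class above relies on two things that are not defined in this file: the `ACTIVITYPUB_HASHTAGS_REGEXP` constant and WordPress' `wp_html_split()`. To make the matching rules from the commit message and the protected-tag walk easier to try outside WordPress, here is a minimal standalone sketch. Everything prefixed with `example_` / `EXAMPLE_` is hypothetical: the pattern is only an approximation written from the commit description (word characters after `#`, preceded by whitespace, `<p>`, `<br>` or the start of the string, followed by whitespace, punctuation or the end), the splitter is a naive stand-in for `wp_html_split()`, and the `/tag/...` URL replaces the term lookup done by `replace_with_links()`.

```php
<?php
// Hypothetical stand-in for ACTIVITYPUB_HASHTAGS_REGEXP; the real constant
// is defined elsewhere in the plugin and may differ.
const EXAMPLE_HASHTAGS_REGEXP = '(?:(?<=\s)|(?<=<p>)|(?<=<br>)|^)#([A-Za-z0-9_]+)(?=\s|[[:punct:]]|$)';

// Collect hashtag names the way insert_post() does (without saving them).
function example_extract_hashtags( string $content ): array {
	if ( preg_match_all( '/' . EXAMPLE_HASHTAGS_REGEXP . '/i', $content, $match ) ) {
		return $match[1];
	}
	return array();
}

// Naive stand-in for wp_html_split(): alternating text and tag chunks.
function example_html_split( string $html ): array {
	return preg_split( '/(<[^>]*>)/', $html, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY );
}

// Link hashtags in text chunks, skipping everything inside protected tags.
function example_link_hashtags( string $html ): string {
	$protected_tags = array( 'pre', 'code', 'textarea', 'style', 'a' );
	$tag_stack      = array();
	$output         = '';

	foreach ( example_html_split( $html ) as $chunk ) {
		if ( preg_match( '#^<(/)?([a-z-]+)\b[^>]*>$#i', $chunk, $m ) ) {
			$tag = strtolower( $m[2] );
			if ( '/' === $m[1] ) {
				// Closing tag: drop it and everything opened after it.
				$i = array_search( $tag, $tag_stack, true );
				if ( false !== $i ) {
					$tag_stack = array_slice( $tag_stack, 0, $i );
				}
			} else {
				// Opening tag: remember it.
				$tag_stack[] = $tag;
			}
			$output .= $chunk;
			continue;
		}

		if ( array_intersect( $tag_stack, $protected_tags ) ) {
			// Text inside a protected tag is copied unchanged.
			$output .= $chunk;
			continue;
		}

		$output .= preg_replace_callback(
			'/' . EXAMPLE_HASHTAGS_REGEXP . '/i',
			static function ( array $m ): string {
				// Hypothetical URL scheme instead of get_term_link().
				return '<a href="/tag/' . strtolower( $m[1] ) . '">#' . $m[1] . '</a>';
			},
			$chunk
		);
	}

	return $output;
}

echo implode( ', ', example_extract_hashtags( '<p>#one and #Two.<br>#three_3 but not a#four</p>' ) ), PHP_EOL;
echo example_link_hashtags( '<p>#fediverse is fun, <code>#ignored</code> and mid#word stay.</p>' ), PHP_EOL;
```

In this sketch a hashtag inside `<code>` is left alone because `code` is still on the tag stack when its text chunk is reached, and `mid#word` is skipped because the `#` is not preceded by whitespace or a line start. Note that once the content is split into chunks, a hashtag directly after a tag is matched through `^` (start of the chunk) rather than through the `<p>` / `<br>` lookbehinds.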