Archiving a Twitter users timeline with Tweet-SQL 3.6


Warning: count(): Parameter must be an array or an object that implements Countable in /home/fbsqlcom/public_html/youdidwhatwithtsql.com/wp-content/plugins/wp-codebox/main.php on line 31

Here’s a quick update of a post I made way back in 2008 to archive a users timeline. This script will allow you download the tweets from any unprotected twitter account. Let’s get started!

First create the following table in your Tweet-SQL database.

CREATE TABLE TwitterArchive
(
	created_at NVARCHAR(30),
	id BIGINT PRIMARY KEY CLUSTERED,
	[text] NVARCHAR(140),
	source NVARCHAR(100),
	truncated NVARCHAR(5),
	in_reply_to_status_id BIGINT,
	in_reply_to_user_id BIGINT,
	favorited NVARCHAR(5),
	status_Id BIGINT,
	statuses_Id BIGINT
);

Just change @screen_name to the user whose timeline you want to archive and you’re good to go.

DECLARE @xml XML,
         @handle INT,
         @screen_name NVARCHAR(200),
         @since_id BIGINT,
         @optional NVARCHAR(50),
         @rowcount TINYINT;
 
  SET @screen_name = 'rhyscampbell';
  SET @since_id = 1;
 
  -- Turn of resultsets from Tweet-SQL
  EXEC dbo.tweet_cfg_resultset_send 0;
 
  -- Set the optional parameter to request page 1
  SET @optional = 'count=200&since_id=' + CAST(@since_id AS NVARCHAR(20));
  -- Get page 1 of your timeline
  EXEC dbo.tweet_sts_user_timeline @screen_name, @optional, @xml OUTPUT;
  SET @rowcount = 1;
 
  WHILE (@rowcount != 0) -- While we still have results to deal with
  BEGIN
      -- Prepare an xml document
      EXEC sp_xml_preparedocument @handle OUTPUT, @xml;
 
      -- Insert the page into a table
      INSERT INTO dbo.TwitterArchive
      (
          created_at,
          id,
          [text],
          source,
          truncated,
          in_reply_to_status_id,
          in_reply_to_user_id,
          favorited,
          status_Id,
          statuses_Id    
      )
      SELECT  created_at,
              id,
              [text],
              source,
              truncated,
              in_reply_to_status_id,
              in_reply_to_user_id,
              favorited,
              status_Id,
              statuses_Id
      FROM OPENXML (@handle, '/objects/objects', 2)
      WITH
      (
          created_at NVARCHAR(30),
          id BIGINT,
          [text] NVARCHAR(140),
          source NVARCHAR(100),
          truncated NVARCHAR(5),
          in_reply_to_status_id BIGINT,
          in_reply_to_user_id BIGINT,
          favorited NVARCHAR(5),
          status_Id BIGINT,
          statuses_Id BIGINT
      );
 
      -- Get the rowcount
      SET @rowcount = @@ROWCOUNT;
      -- destroy the xml document
      EXEC sp_xml_removedocument @handle;
      -- Increment the ssince_id
      SET @since_id = (SELECT MIN(id) FROM dbo.TwitterArchive) -1;
      -- Setup the optional parameter
      SET @optional = 'count=200&max_id=' + CAST(@since_id AS NVARCHAR(20));
      -- Wait for a bit...
      WAITFOR DELAY '00:00:05';
      -- Get the next page
      EXEC dbo.tweet_sts_user_timeline @screen_name, @optional, @xml OUTPUT;
  END 
  -- Turn resultsets from Tweet-SQL back on as appropriate
  EXEC dbo.tweet_cfg_resultset_send 1;

The twitter API documentation states up to 3200 tweets can be retrieved this way. I got an extra one for free!

rhyscampbell_twitter_archive

Leave a Reply