OIO,OPENIO,account,0 / raise MasterNotFoundError


#1

Hello,
This message keeps on showing on every node.

Mar 16 10:17:35 oio-2 OIO,OPENIO,account,0[12888]: 12888 7F4C1C900EB0 log ERROR ERROR Unhandled exception in request
                                                   Traceback (most recent call last):
                                                     File "/usr/lib/python2.7/dist-packages/oio/common/wsgi.py", line 107, in dispatch_request
                                                       resp = getattr(self, 'on_' + endpoint)(req)
                                                     File "/usr/lib/python2.7/dist-packages/oio/account/server.py", line 56, in on_status
                                                       status = self.backend.status()
                                                     File "/usr/lib/python2.7/dist-packages/oio/account/backend.py", line 422, in status
                                                       account_count = conn.hlen('accounts:')
                                                     File "/usr/lib/python2.7/dist-packages/redis/client.py", line 1879, in hlen
                                                       return self.execute_command('HLEN', name)
                                                     File "/usr/lib/python2.7/dist-packages/redis/client.py", line 578, in execute_command
                                                       connection.send_command(*args)
                                                     File "/usr/lib/python2.7/dist-packages/redis/connection.py", line 563, in send_command
                                                       self.send_packed_command(self.pack_command(*args))
                                                     File "/usr/lib/python2.7/dist-packages/redis/connection.py", line 538, in send_packed_command
                                                       self.connect()
                                                     File "/usr/lib/python2.7/dist-packages/redis/sentinel.py", line 44, in connect
                                                       self.connect_to(self.connection_pool.get_master_address())
                                                     File "/usr/lib/python2.7/dist-packages/redis/sentinel.py", line 100, in get_master_address
                                                       self.service_name)
                                                     File "/usr/lib/python2.7/dist-packages/redis/sentinel.py", line 222, in discover_master
                                                       raise MasterNotFoundError("No master found for %r" % (service_name,))
                                                   MasterNotFoundError: No master found for 'OPENIO-master-1'

But gridinit_cmd status shows everything as up. The first node is usually the master node, isn’t it?

Also, what is the role of the ‘account’ service?

Best regards,
Yongsheng


#2

/var/log/oio/sds/OPENIO/rawx-11/rawx-11-httpd-errors.log

[Fri Mar 16 06:25:01.720470 2018] [mpm_worker:notice] [pid 12582:tid 140440182929280] AH00297: SIGUSR1 received.  Doing graceful restart
[Fri Mar 16 06:25:01.781426 2018] [core:error] [pid 12582:tid 140440182929280] (EAI 2)Name or service not known: AH00549: Failed to resolve server name for 192.168.2.22 (check DNS) -- or specify an explicit ServerName
[Fri Mar 16 06:25:01.782360 2018] [mpm_worker:notice] [pid 12582:tid 140440182929280] AH00292: Apache/2.4.18 (Ubuntu) configured -- resuming normal operations
[Fri Mar 16 06:25:01.782375 2018] [core:notice] [pid 12582:tid 140440182929280] AH00094: Command line: '/usr/sbin/apache2 -D FOREGROUND -f /etc/oio/sds/OPENIO/rawx-11/rawx-11-httpd.conf'

Is it necessary to assign hostnames in /etc/hosts or in DNS?


#3

I can’t create an account. It reports:

Unmanaged error: No master found for 'OPENIO-master-1' (HTTP 500)


#4

Hello @yongsheng, looks like the Redis cluster isn’t in good shape, either because it wasn’t properly configured, or because something went wrong with the nodes.

Could you please give me the output on each node of:

grep slaveof /etc/oio/sds/OPENIO/redis-*/redis.conf

As well as the contents of the openiosds::redis block in the puppet file used on each node.

Also, the output of tail -50 /var/log/oio/sds/OPENIO/redis-*/redis-*.log on each node might help; could you please send it over as well?
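
If you want to see what a Sentinel itself has on record, here is a rough Python sketch that sends SENTINEL get-master-addr-by-name over a plain socket. It is not part of the OpenIO tooling, the function names are made up for illustration, and the host and port are whatever your sentinel binds to. It is also not how redis-py discovers the master internally; it only shows which address the Sentinel currently associates with the master name.

```python
import socket


def encode_command(*args):
    """Encode a command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def parse_master_reply(reply):
    """Parse the reply to SENTINEL get-master-addr-by-name.

    Returns (ip, port), or None when the reply is not a two-element
    array (for example a null array for an unknown master name).
    """
    lines = reply.split(b"\r\n")
    if len(lines) < 5 or lines[0] != b"*2":
        return None
    # Layout: *2, $<len>, <ip>, $<len>, <port>
    return lines[2].decode("utf-8"), int(lines[4])


def ask_sentinel(host, port, master_name, timeout=2.0):
    """Ask one sentinel which address it has for master_name.

    Assumption: the whole reply arrives in a single recv(), which is
    good enough for a quick check but not for production code.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(
            encode_command("SENTINEL", "get-master-addr-by-name", master_name)
        )
        return parse_master_reply(sock.recv(4096))
```

Calling ask_sentinel(<sentinel ip>, <sentinel port>, 'OPENIO-master-1') on each node and comparing the answers should show whether the sentinels agree on a master at all.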


#5

Hello vladimir,

Three nodes;
192.168.2.21 ( oio-1 )

root@oio-1:~# grep slaveof /etc/oio/sds/OPENIO/redis-*/redis.conf

openiosds::redis {'redis-0':
  ns        => 'OPENIO',
  ipaddress => $ipaddr,
}


root@oio-1:~# tail -50 /var/log/oio/sds/OPENIO/redis-*/redis-*.log
1172:S 16 Mar 17:06:26.894 * Retrying with SYNC...
1172:S 16 Mar 17:06:26.894 # MASTER aborted replication with an error: ERR Can't SYNC while not connected with my master
1172:S 16 Mar 17:06:27.893 * Connecting to MASTER 192.168.2.22:6011
1172:S 16 Mar 17:06:27.893 * MASTER <-> SLAVE sync started
1172:S 16 Mar 17:06:27.893 * Non blocking connect for SYNC fired the event.
1172:S 16 Mar 17:06:27.893 * Master replied to PING, replication can continue...
1172:S 16 Mar 17:06:27.894 * Partial resynchronization not possible (no cached master)
1172:S 16 Mar 17:06:27.894 * Master does not support PSYNC or is in error state (reply: -ERR Can't SYNC while not connected with my master)
1172:S 16 Mar 17:06:27.894 * Retrying with SYNC...
1172:S 16 Mar 17:06:27.894 # MASTER aborted replication with an error: ERR Can't SYNC while not connected with my master
1172:S 16 Mar 17:06:28.895 * Connecting to MASTER 192.168.2.22:6011
1172:S 16 Mar 17:06:28.895 * MASTER <-> SLAVE sync started
1172:S 16 Mar 17:06:28.895 * Non blocking connect for SYNC fired the event.
1172:S 16 Mar 17:06:28.896 * Master replied to PING, replication can continue...
1172:S 16 Mar 17:06:28.896 * Partial resynchronization not possible (no cached master)
1172:S 16 Mar 17:06:28.896 * Master does not support PSYNC or is in error state (reply: -ERR Can't SYNC while not connected with my master)
1172:S 16 Mar 17:06:28.896 * Retrying with SYNC...
1172:S 16 Mar 17:06:28.897 # MASTER aborted replication with an error: ERR Can't SYNC while not connected with my master
1172:S 16 Mar 17:06:29.897 * Connecting to MASTER 192.168.2.22:6011
1172:S 16 Mar 17:06:29.897 * MASTER <-> SLAVE sync started
1172:S 16 Mar 17:06:29.897 * Non blocking connect for SYNC fired the event.
1172:S 16 Mar 17:06:29.897 * Master replied to PING, replication can continue...
1172:S 16 Mar 17:06:29.898 * Partial resynchronization not possible (no cached master)
1172:S 16 Mar 17:06:29.898 * Master does not support PSYNC or is in error state (reply: -ERR Can't SYNC while not connected with my master)
1172:S 16 Mar 17:06:29.898 * Retrying with SYNC...
1172:S 16 Mar 17:06:29.898 # MASTER aborted replication with an error: ERR Can't SYNC while not connected with my master
1172:S 16 Mar 17:06:30.899 * Connecting to MASTER 192.168.2.22:6011
1172:S 16 Mar 17:06:30.900 * MASTER <-> SLAVE sync started
1172:S 16 Mar 17:06:30.900 * Non blocking connect for SYNC fired the event.
1172:S 16 Mar 17:06:30.900 * Master replied to PING, replication can continue...
1172:S 16 Mar 17:06:30.901 * Partial resynchronization not possible (no cached master)
1172:S 16 Mar 17:06:30.901 * Master does not support PSYNC or is in error state (reply: -ERR Can't SYNC while not connected with my master)
1172:S 16 Mar 17:06:30.901 * Retrying with SYNC...
1172:S 16 Mar 17:06:30.901 # MASTER aborted replication with an error: ERR Can't SYNC while not connected with my master
1172:S 16 Mar 17:06:31.904 * Connecting to MASTER 192.168.2.22:6011
1172:S 16 Mar 17:06:31.904 * MASTER <-> SLAVE sync started
1172:S 16 Mar 17:06:31.905 * Non blocking connect for SYNC fired the event.
1172:S 16 Mar 17:06:31.905 * Master replied to PING, replication can continue...
1172:S 16 Mar 17:06:31.905 * Partial resynchronization not possible (no cached master)
1172:S 16 Mar 17:06:31.905 * Master does not support PSYNC or is in error state (reply: -ERR Can't SYNC while not connected with my master)
1172:S 16 Mar 17:06:31.905 * Retrying with SYNC...
1172:S 16 Mar 17:06:31.905 # MASTER aborted replication with an error: ERR Can't SYNC while not connected with my master
1172:S 16 Mar 17:06:32.908 * Connecting to MASTER 192.168.2.22:6011
1172:S 16 Mar 17:06:32.909 * MASTER <-> SLAVE sync started
1172:S 16 Mar 17:06:32.909 * Non blocking connect for SYNC fired the event.
1172:S 16 Mar 17:06:32.909 * Master replied to PING, replication can continue...
1172:S 16 Mar 17:06:32.909 * Partial resynchronization not possible (no cached master)
1172:S 16 Mar 17:06:32.909 * Master does not support PSYNC or is in error state (reply: -ERR Can't SYNC while not connected with my master)
1172:S 16 Mar 17:06:32.909 * Retrying with SYNC...
1172:S 16 Mar 17:06:32.909 # MASTER aborted replication with an error: ERR Can't SYNC while not connected with my master
root@oio-1:~# 
root@oio-1:~# 

192.168.2.22 ( oio-2 )

root@oio-2:~# grep slaveof /etc/oio/sds/OPENIO/redis-*/redis.conf
slaveof 192.168.2.21 6011

openiosds::redis {'redis-0':
  ns        => 'OPENIO',
  ipaddress => $ipaddr,
  slaveof   => '192.168.2.21 6011',
}


root@oio-2:~# tail -50 /var/log/oio/sds/OPENIO/redis-*/redis-*.log
12588:S 16 Mar 17:08:10.026 * Retrying with SYNC...
12588:S 16 Mar 17:08:10.026 # MASTER aborted replication with an error: ERR Can't SYNC while not connected with my master
12588:S 16 Mar 17:08:11.026 * Connecting to MASTER 192.168.2.21:6011
12588:S 16 Mar 17:08:11.026 * MASTER <-> SLAVE sync started
12588:S 16 Mar 17:08:11.027 * Non blocking connect for SYNC fired the event.
12588:S 16 Mar 17:08:11.027 * Master replied to PING, replication can continue...
12588:S 16 Mar 17:08:11.027 * Partial resynchronization not possible (no cached master)
12588:S 16 Mar 17:08:11.027 * Master does not support PSYNC or is in error state (reply: -ERR Can't SYNC while not connected with my master)
12588:S 16 Mar 17:08:11.027 * Retrying with SYNC...
12588:S 16 Mar 17:08:11.028 # MASTER aborted replication with an error: ERR Can't SYNC while not connected with my master
12588:S 16 Mar 17:08:12.028 * Connecting to MASTER 192.168.2.21:6011
12588:S 16 Mar 17:08:12.029 * MASTER <-> SLAVE sync started
12588:S 16 Mar 17:08:12.029 * Non blocking connect for SYNC fired the event.
12588:S 16 Mar 17:08:12.029 * Master replied to PING, replication can continue...
12588:S 16 Mar 17:08:12.029 * Partial resynchronization not possible (no cached master)
12588:S 16 Mar 17:08:12.029 * Master does not support PSYNC or is in error state (reply: -ERR Can't SYNC while not connected with my master)
12588:S 16 Mar 17:08:12.029 * Retrying with SYNC...
12588:S 16 Mar 17:08:12.030 # MASTER aborted replication with an error: ERR Can't SYNC while not connected with my master
12588:S 16 Mar 17:08:13.031 * Connecting to MASTER 192.168.2.21:6011
12588:S 16 Mar 17:08:13.032 * MASTER <-> SLAVE sync started
12588:S 16 Mar 17:08:13.032 * Non blocking connect for SYNC fired the event.
12588:S 16 Mar 17:08:13.032 * Master replied to PING, replication can continue...
12588:S 16 Mar 17:08:13.032 * Partial resynchronization not possible (no cached master)
12588:S 16 Mar 17:08:13.033 * Master does not support PSYNC or is in error state (reply: -ERR Can't SYNC while not connected with my master)
12588:S 16 Mar 17:08:13.033 * Retrying with SYNC...
12588:S 16 Mar 17:08:13.033 # MASTER aborted replication with an error: ERR Can't SYNC while not connected with my master
12588:S 16 Mar 17:08:14.035 * Connecting to MASTER 192.168.2.21:6011
12588:S 16 Mar 17:08:14.035 * MASTER <-> SLAVE sync started
12588:S 16 Mar 17:08:14.035 * Non blocking connect for SYNC fired the event.
12588:S 16 Mar 17:08:14.036 * Master replied to PING, replication can continue...
12588:S 16 Mar 17:08:14.036 * Partial resynchronization not possible (no cached master)
12588:S 16 Mar 17:08:14.036 * Master does not support PSYNC or is in error state (reply: -ERR Can't SYNC while not connected with my master)
12588:S 16 Mar 17:08:14.036 * Retrying with SYNC...
12588:S 16 Mar 17:08:14.036 # MASTER aborted replication with an error: ERR Can't SYNC while not connected with my master
12588:S 16 Mar 17:08:15.037 * Connecting to MASTER 192.168.2.21:6011
12588:S 16 Mar 17:08:15.037 * MASTER <-> SLAVE sync started
12588:S 16 Mar 17:08:15.037 * Non blocking connect for SYNC fired the event.
12588:S 16 Mar 17:08:15.038 * Master replied to PING, replication can continue...
12588:S 16 Mar 17:08:15.038 * Partial resynchronization not possible (no cached master)
12588:S 16 Mar 17:08:15.038 * Master does not support PSYNC or is in error state (reply: -ERR Can't SYNC while not connected with my master)
12588:S 16 Mar 17:08:15.038 * Retrying with SYNC...
12588:S 16 Mar 17:08:15.038 # MASTER aborted replication with an error: ERR Can't SYNC while not connected with my master
12588:S 16 Mar 17:08:16.039 * Connecting to MASTER 192.168.2.21:6011
12588:S 16 Mar 17:08:16.039 * MASTER <-> SLAVE sync started
12588:S 16 Mar 17:08:16.040 * Non blocking connect for SYNC fired the event.
12588:S 16 Mar 17:08:16.040 * Master replied to PING, replication can continue...
12588:S 16 Mar 17:08:16.040 * Partial resynchronization not possible (no cached master)
12588:S 16 Mar 17:08:16.040 * Master does not support PSYNC or is in error state (reply: -ERR Can't SYNC while not connected with my master)
12588:S 16 Mar 17:08:16.040 * Retrying with SYNC...
12588:S 16 Mar 17:08:16.040 # MASTER aborted replication with an error: ERR Can't SYNC while not connected with my master
root@oio-2:~# 
root@oio-2:~# 

192.168.2.23 ( oio-3 )

root@oio-3:~# grep slaveof /etc/oio/sds/OPENIO/redis-*/redis.conf
slaveof 192.168.2.21 6011

openiosds::redis {'redis-0':
  ns        => 'OPENIO',
  ipaddress => $ipaddr,
  slaveof   => '192.168.2.21 6011',
}

root@oio-3:~# tail -50 /var/log/oio/sds/OPENIO/redis-*/redis-*.log
1186:S 16 Mar 17:09:16.340 * Retrying with SYNC...
1186:S 16 Mar 17:09:16.340 # MASTER aborted replication with an error: ERR Can't SYNC while not connected with my master
1186:S 16 Mar 17:09:17.340 * Connecting to MASTER 192.168.2.21:6011
1186:S 16 Mar 17:09:17.340 * MASTER <-> SLAVE sync started
1186:S 16 Mar 17:09:17.341 * Non blocking connect for SYNC fired the event.
1186:S 16 Mar 17:09:17.341 * Master replied to PING, replication can continue...
1186:S 16 Mar 17:09:17.341 * Partial resynchronization not possible (no cached master)
1186:S 16 Mar 17:09:17.341 * Master does not support PSYNC or is in error state (reply: -ERR Can't SYNC while not connected with my master)
1186:S 16 Mar 17:09:17.341 * Retrying with SYNC...
1186:S 16 Mar 17:09:17.341 # MASTER aborted replication with an error: ERR Can't SYNC while not connected with my master
1186:S 16 Mar 17:09:18.343 * Connecting to MASTER 192.168.2.21:6011
1186:S 16 Mar 17:09:18.343 * MASTER <-> SLAVE sync started
1186:S 16 Mar 17:09:18.344 * Non blocking connect for SYNC fired the event.
1186:S 16 Mar 17:09:18.344 * Master replied to PING, replication can continue...
1186:S 16 Mar 17:09:18.344 * Partial resynchronization not possible (no cached master)
1186:S 16 Mar 17:09:18.344 * Master does not support PSYNC or is in error state (reply: -ERR Can't SYNC while not connected with my master)
1186:S 16 Mar 17:09:18.344 * Retrying with SYNC...
1186:S 16 Mar 17:09:18.344 # MASTER aborted replication with an error: ERR Can't SYNC while not connected with my master
1186:S 16 Mar 17:09:19.346 * Connecting to MASTER 192.168.2.21:6011
1186:S 16 Mar 17:09:19.346 * MASTER <-> SLAVE sync started
1186:S 16 Mar 17:09:19.346 * Non blocking connect for SYNC fired the event.
1186:S 16 Mar 17:09:19.346 * Master replied to PING, replication can continue...
1186:S 16 Mar 17:09:19.347 * Partial resynchronization not possible (no cached master)
1186:S 16 Mar 17:09:19.347 * Master does not support PSYNC or is in error state (reply: -ERR Can't SYNC while not connected with my master)
1186:S 16 Mar 17:09:19.347 * Retrying with SYNC...
1186:S 16 Mar 17:09:19.347 # MASTER aborted replication with an error: ERR Can't SYNC while not connected with my master
1186:S 16 Mar 17:09:20.348 * Connecting to MASTER 192.168.2.21:6011
1186:S 16 Mar 17:09:20.348 * MASTER <-> SLAVE sync started
1186:S 16 Mar 17:09:20.348 * Non blocking connect for SYNC fired the event.
1186:S 16 Mar 17:09:20.348 * Master replied to PING, replication can continue...
1186:S 16 Mar 17:09:20.348 * Partial resynchronization not possible (no cached master)
1186:S 16 Mar 17:09:20.349 * Master does not support PSYNC or is in error state (reply: -ERR Can't SYNC while not connected with my master)
1186:S 16 Mar 17:09:20.349 * Retrying with SYNC...
1186:S 16 Mar 17:09:20.349 # MASTER aborted replication with an error: ERR Can't SYNC while not connected with my master
1186:S 16 Mar 17:09:21.351 * Connecting to MASTER 192.168.2.21:6011
1186:S 16 Mar 17:09:21.351 * MASTER <-> SLAVE sync started
1186:S 16 Mar 17:09:21.351 * Non blocking connect for SYNC fired the event.
1186:S 16 Mar 17:09:21.352 * Master replied to PING, replication can continue...
1186:S 16 Mar 17:09:21.352 * Partial resynchronization not possible (no cached master)
1186:S 16 Mar 17:09:21.352 * Master does not support PSYNC or is in error state (reply: -ERR Can't SYNC while not connected with my master)
1186:S 16 Mar 17:09:21.352 * Retrying with SYNC...
1186:S 16 Mar 17:09:21.352 # MASTER aborted replication with an error: ERR Can't SYNC while not connected with my master
1186:S 16 Mar 17:09:22.354 * Connecting to MASTER 192.168.2.21:6011
1186:S 16 Mar 17:09:22.354 * MASTER <-> SLAVE sync started
1186:S 16 Mar 17:09:22.354 * Non blocking connect for SYNC fired the event.
1186:S 16 Mar 17:09:22.354 * Master replied to PING, replication can continue...
1186:S 16 Mar 17:09:22.354 * Partial resynchronization not possible (no cached master)
1186:S 16 Mar 17:09:22.355 * Master does not support PSYNC or is in error state (reply: -ERR Can't SYNC while not connected with my master)
1186:S 16 Mar 17:09:22.355 * Retrying with SYNC...
1186:S 16 Mar 17:09:22.355 # MASTER aborted replication with an error: ERR Can't SYNC while not connected with my master
root@oio-3:~# 
root@oio-3:~#

#6

Does the first node think the second node is the master?


#7

Could you give me the contents of the openiosds::redissentinel block in the puppet file of node1, as well as the output of cat /etc/oio/sds/OPENIO/redissentinel-*/redis-sentinel.conf on all nodes?

I believe indeed that there was a problem somewhere, and that the redis on node1 doesn’t quite realise it is the master.
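
For what it is worth, the logs in #5 show oio-1 trying to replicate from 192.168.2.22 while oio-2 tries to replicate from 192.168.2.21, which would explain the "Can't SYNC while not connected with my master" replies. A small Python sketch along these lines could spot such a loop from the "Connecting to MASTER" log lines. The function names are hypothetical and the log excerpts are single lines copied from #5.

```python
import re

MASTER_RE = re.compile(r"Connecting to MASTER (\d+\.\d+\.\d+\.\d+):(\d+)")


def master_from_log(log_text):
    """Return the last 'ip:port' this Redis tried to replicate from, or None."""
    matches = MASTER_RE.findall(log_text)
    if not matches:
        return None
    ip, port = matches[-1]
    return "%s:%s" % (ip, port)


def find_replication_loop(masters):
    """Find a replication loop.

    masters maps a node's 'ip:port' to the 'ip:port' it replicates from
    (None for a node that is not a slave). Returns the nodes forming a
    loop, or [] when every chain ends at a real master.
    """
    for start in masters:
        seen = []
        node = start
        while node is not None and node not in seen:
            seen.append(node)
            node = masters.get(node)
        if node is not None:
            return seen[seen.index(node):]
    return []


logs = {
    "192.168.2.21:6011": "1172:S 16 Mar 17:06:27.893 * Connecting to MASTER 192.168.2.22:6011",
    "192.168.2.22:6011": "12588:S 16 Mar 17:08:11.026 * Connecting to MASTER 192.168.2.21:6011",
    "192.168.2.23:6011": "1186:S 16 Mar 17:09:17.340 * Connecting to MASTER 192.168.2.21:6011",
}
masters = {node: master_from_log(text) for node, text in logs.items()}
loop = find_replication_loop(masters)
print(loop)  # ['192.168.2.21:6011', '192.168.2.22:6011']
```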


#8
openiosds::redissentinel {'redissentinel-0':
  ns          => 'OPENIO',
  ipaddress   => $ipaddr,
  master_name => 'OPENIO-master-1',
  redis_host  => "192.168.2.21",
}


root@oio-1:~# cat /etc/oio/sds/OPENIO/redissentinel-*/redis-sentinel.conf
bind 192.168.2.21
port 6012
dir "/var/lib/oio/sds/OPENIO/redissentinel-0"
daemonize no
pidfile "/var/lib/oio/sds/OPENIO/redissentinel-0/redissentinel-0.pid"
logfile "/var/log/oio/sds/OPENIO/redissentinel-0/redissentinel-0.log"

sentinel monitor OPENIO-master-1 192.168.2.22 6011 2
sentinel down-after-milliseconds OPENIO-master-1 1000
sentinel config-epoch OPENIO-master-1 1
sentinel leader-epoch OPENIO-master-1 703
# Generated by CONFIG REWRITE
sentinel known-slave OPENIO-master-1 192.168.2.21 6011
sentinel known-slave OPENIO-master-1 192.168.2.23 6011
sentinel known-sentinel OPENIO-master-1 192.168.2.23 6012 c7aedc62be20751f92f7228c154b1f21c5a94b6b
sentinel known-sentinel OPENIO-master-1 192.168.2.22 6012 6fc30b65c4128f4d1dc8ded3c1dfeb2d23a7d545
sentinel current-epoch 703


root@oio-2:~# cat /etc/oio/sds/OPENIO/redissentinel-*/redis-sentinel.conf
bind 192.168.2.22
port 6012
dir "/var/lib/oio/sds/OPENIO/redissentinel-0"
daemonize no
pidfile "/var/lib/oio/sds/OPENIO/redissentinel-0/redissentinel-0.pid"
logfile "/var/log/oio/sds/OPENIO/redissentinel-0/redissentinel-0.log"

sentinel monitor OPENIO-master-1 192.168.2.22 6011 2
sentinel down-after-milliseconds OPENIO-master-1 1000
sentinel config-epoch OPENIO-master-1 1
sentinel leader-epoch OPENIO-master-1 704
# Generated by CONFIG REWRITE
sentinel known-slave OPENIO-master-1 192.168.2.21 6011
sentinel known-sentinel OPENIO-master-1 192.168.2.23 6012 c7aedc62be20751f92f7228c154b1f21c5a94b6b
sentinel known-sentinel OPENIO-master-1 192.168.2.21 6012 f64eef38b4dab253201a02040f02009559dacc15
sentinel current-epoch 704


root@oio-3:~# cat /etc/oio/sds/OPENIO/redissentinel-*/redis-sentinel.conf
bind 192.168.2.23
port 6012
dir "/var/lib/oio/sds/OPENIO/redissentinel-0"
daemonize no
pidfile "/var/lib/oio/sds/OPENIO/redissentinel-0/redissentinel-0.pid"
logfile "/var/log/oio/sds/OPENIO/redissentinel-0/redissentinel-0.log"

sentinel monitor OPENIO-master-1 192.168.2.22 6011 2
sentinel down-after-milliseconds OPENIO-master-1 1000
sentinel config-epoch OPENIO-master-1 1
sentinel leader-epoch OPENIO-master-1 704
# Generated by CONFIG REWRITE
sentinel known-slave OPENIO-master-1 192.168.2.21 6011
sentinel known-sentinel OPENIO-master-1 192.168.2.22 6012 6fc30b65c4128f4d1dc8ded3c1dfeb2d23a7d545
sentinel known-sentinel OPENIO-master-1 192.168.2.21 6012 f64eef38b4dab253201a02040f02009559dacc15
sentinel current-epoch 704

#9

Hello,

I would like to know whether this is a version control issue or something else, and when it can be fixed. It happened on Ubuntu 16; the issue does not occur on CentOS 7.


#10

Hello again @yongsheng, this is not an issue with version control, but with Redis service master assignment, which somehow went wrong the first time on your Ubuntu setup, but went well the second time.
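
To make the mismatch visible, a sketch like the following (Python, hypothetical function name) pulls the monitored master and the known slaves out of a redis-sentinel.conf. The sample text is an excerpt of the host 1 configuration posted in #8, and the intended master address is an assumption taken from the redis_host and slaveof values in the puppet files.

```python
def summarize_sentinel_conf(conf_text, master_name="OPENIO-master-1"):
    """Extract the monitored master and known slaves for one master name."""
    summary = {"master": None, "slaves": []}
    for line in conf_text.splitlines():
        fields = line.split()
        # Only 'sentinel <directive> <master_name> <ip> <port> ...' lines matter.
        if len(fields) < 5 or fields[0] != "sentinel" or fields[2] != master_name:
            continue
        address = "%s:%s" % (fields[3], fields[4])
        if fields[1] == "monitor":
            summary["master"] = address
        elif fields[1] == "known-slave":
            summary["slaves"].append(address)
    return summary


host1_conf = """\
sentinel monitor OPENIO-master-1 192.168.2.22 6011 2
sentinel down-after-milliseconds OPENIO-master-1 1000
# Generated by CONFIG REWRITE
sentinel known-slave OPENIO-master-1 192.168.2.21 6011
sentinel known-slave OPENIO-master-1 192.168.2.23 6011
sentinel known-sentinel OPENIO-master-1 192.168.2.23 6012 c7aedc62be20751f92f7228c154b1f21c5a94b6b
"""

summary = summarize_sentinel_conf(host1_conf)
# Assumption: the intended master is the redis_host from the puppet file.
intended_master = "192.168.2.21:6011"
print(summary["master"] == intended_master)  # False: 192.168.2.22 is monitored
print(intended_master in summary["slaves"])  # True: it is listed as a slave
```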

When it happens you can fix it manually by:

  • removing all lines containing sentinel known-slave from the redis_sentinel configuration on hosts 2 and 3
  • replacing sentinel known-slave OPENIO-master-1 192.168.2.21 6011 with sentinel known-slave OPENIO-master-1 192.168.2.22 6011 on host 1
  • restarting the redis and redis_sentinel services on all machines using gridinit_cmd restart @redissentinel @redis
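
If you prefer to script the two configuration edits above, a minimal Python sketch could look like this. The function names are hypothetical, the functions only transform the configuration text, and reading and writing the files (after a backup) and the gridinit_cmd restart are left to you.

```python
def drop_known_slaves(conf_text):
    """Hosts 2 and 3: remove every 'sentinel known-slave' line."""
    kept = [
        line for line in conf_text.splitlines()
        if not line.startswith("sentinel known-slave")
    ]
    return "\n".join(kept) + "\n"


def swap_known_slave(conf_text,
                     old="192.168.2.21 6011",
                     new="192.168.2.22 6011",
                     master_name="OPENIO-master-1"):
    """Host 1: replace the known-slave entry for `old` with one for `new`."""
    old_line = "sentinel known-slave %s %s" % (master_name, old)
    new_line = "sentinel known-slave %s %s" % (master_name, new)
    lines = [
        new_line if line.strip() == old_line else line
        for line in conf_text.splitlines()
    ]
    return "\n".join(lines) + "\n"
```

Every other line of the file is kept unchanged.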

#11

Yes, I did something wrong the first time. I will try your suggestion and let you know if it works.


#12

Nice. It’s gone. I can create/show account now.

BTW, I see this message showing up repeatedly on each node that has a rawx added:

Mar 20 11:24:02 oio-3 OIO,OPENIO,oio-event-agent,0[1851]: 1851 7FAE5CA64A50 log WARNING event 1 handling failure (release with delay): rdir update error: No rdir assigned to volume 192.168.2.23:6052

I didn’t assign an rdir when configuring rawx. What’s the role of rdir?


#13

The RDIR (or reverse directory) is a service that complements the rawx service: it stores the chunk references of one or more rawx services so that data can be reconstructed easily after an incident.

Each time you add rawx services you should run openio volume admin bootstrap --oio-ns OPENIO. Try it; it should fix your problem.


#14

I got this message when running openio volume admin bootstrap --oio-ns OPENIO:

No META1: Backend error: META0 partially missing (HTTP 500) (STATUS 500)

It might also be caused by something going wrong the first time I applied puppet. Everything else in the status and list commands looks OK, and all services are up.


#15

This is usually caused by a failed directory bootstrap. Did you run the following command: openio directory bootstrap --oio-ns OPENIO --replicas 3 after applying the puppet files? While you’re at it, please check that you have done the zk-bootstrap as well.

Basically, just check that all instructions in the documentation have been followed from this point onward: http://docs.openio.io/17.04/install-guide-centos/installation.html#initialize-openio-namespace.